Abstract


Estimation  and  Control  with  Relative  Measurements: 
Algorithms  and  Scaling  Laws 

by 


Prabir  Barooah 


In this dissertation we examine a class of estimation and control problems
involving interconnected systems. These problems share the common attribute
that, between two component subsystems, only noisy measurements of the
difference of their states are available. The estimation problem is relevant to
sensor and actuator networks, and the control problem is relevant to
coordination in multi-agent systems. Both classes of problems are defined over
a graph that is used to describe the interconnections.

In the first part of this dissertation, the estimation problem is examined. The
variables correspond to the nodes of a graph, and the noisy measurements of the
difference between pairs of variables correspond to its edges. The task is to
compute estimates of the node variables with respect to a reference node. We
begin by designing distributed algorithms to compute the optimal estimate, that
is, the one produced by the best linear unbiased estimator (BLUE). We then
examine the effect of the graph structure on the minimum achievable estimation
error. Specifically, we examine how the optimal estimation error of a node
variable grows with its distance from the reference node. A classification of
graphs - sparse and dense in 1D, 2D, and 3D - is obtained, which determines the
error growth rate: linear, logarithmic, or bounded.


In the second part of this dissertation, the control of formations over arbitrary
graphs is described. Specifically, we examine how the structure of the
interconnection graph affects the stability and sensitivity to measurement noise
of the formation. The vehicular platoon problem is investigated in detail -
especially the decentralized bidirectional control architecture in which each
vehicle uses front and back spacing measurements to compute its control signal.
Fundamental limitations in disturbance amplification are established for the
symmetric bidirectional architecture. Then we show that an arbitrarily small
asymmetry in the front and back controller gains can lead to an order of
magnitude improvement in stability margin.

The underlying theme of our investigations is that of performance degradation
- and possible amelioration - in interconnected systems as the number of
constituent sub-systems increases.
Chapter 1


Introduction 


Recent years have seen the development and proliferation of devices that are
equipped with embedded sensing, processing, wireless communication, and
actuation capability. As a result, it is becoming possible to monitor and control
processes that are distributed over large geographical areas, by deploying a large
number of such devices and interconnecting them, possibly using wireless
communication. The individual devices are called nodes, and the collection of such
devices deployed for sensing tasks is called a sensor network. The epithet network
refers to the interconnection between nodes. The New York Harbor Observing
and Prediction System (NYHOPS) is an example of a sensor network, which
consists of a number of sensors connected by a wireless communication network that
monitor salinity, temperature, turbidity, and water levels in the Hudson River
estuary [3]. Another example of a sensor network is a group of unmanned aerial
vehicles (UAVs) that can be used to collaboratively detect and track targets [4].
When the devices have actuation capability as well, the network is called an
actuator network. More generally, a network of devices with both sensing and
actuation capability is called a sensor-actuator network. An example of a
sensor-actuator network is an irrigation network of water level sensors and gate
actuators that is interconnected over a communication infrastructure [5]. Another
example of a sensor-actuator network is an Automated Highway System (AHS), in
which a group of autonomous vehicles forms a platoon, in which every vehicle
takes local control action based on on-board sensor readings, so that a constant
inter-vehicular separation is maintained [6, 7].

Successful application of sensor and actuator networks requires tackling novel
estimation and control problems. Typically, the nodes of a sensor and actuator
network are distributed in a large spatial domain. As a result, only local
information is available to each of the nodes. The nodes have to either estimate
global quantities of interest, or take appropriate control action, based on locally
measured quantities. A special situation that arises in several applications is
that a node can measure relative quantities, from which it has to estimate the
absolute ones. For example, a node may be able to measure its relative position
with respect to a nearby node but not its position in a global coordinate frame.
The small size, low cost, and low energy budget of the sensor nodes that make
them so attractive for a myriad of applications preclude these nodes having
on-board GPS [8]. Yet, in order that the user of the network can utilize the data
gathered by the sensor network, location information of the data sources is
needed. Therefore, nodes need to estimate their own locations in a global
coordinate frame from measurements of relative positions, which may furthermore
be corrupted with high levels of noise due to the limitations of the measurement
techniques.

In  certain  applications,  estimation  of  global  attributes  may  not  be  needed,  but 
the  nodes  may  need  to  take  appropriate  control  action  based  solely  on  relative 
measurements  in  order  to  achieve  a  common  objective.  For  example,  a  team  of 
UAVs  may  need  to  maintain  a  specific  formation,  while  each  UAV  can  measure 
only its relative position with respect to its nearby UAVs. Each node, i.e., UAV,
is  required  to  employ  a  decentralized  control  law  that  uses  only  the  local,  relative 
position  measurements.  Even  if  the  dynamics  of  individual  nodes  were  otherwise 
independent  of  each  other,  since  the  control  action  taken  by  one  node  depends 
on  its  relative  position  with  other  nodes,  the  closed-loop  dynamics  of  the  nodes 
become  coupled  with  one  another. 

Fusing noisy measurements obtained from a network of nodes to produce accurate
estimates, when the nodes are spatially separated, communicate with one
another through an unreliable wireless medium, and have limited battery lives, is
a challenging task - especially when the size of the network is large. Similarly,
devising local control algorithms for individual nodes of an interconnected network
of  dynamical  systems,  such  that  the  whole  system  achieves  a  global  objective,  is 
a  difficult  problem,  one  for  which  traditional  design  and  analysis  tools  of  control 
theory  are  not  adequate. 

This  dissertation  investigates  a  few  of  the  estimation  and  control  problems 
that  are  motivated  by  sensor  and  actuator  network  applications.  The  dissertation 
consists  of  two  parts  -  part  I  deals  with  the  estimation  problems  and  part  II, 
with  control.  In  the  following  sections  we  describe  the  problems  examined  in  the 
two  parts,  the  challenges  in  each,  and  briefly  summarize  the  contributions  of  this 
work  in  each  problem  category.  The  contributions  are  listed  chapter-wise  for  easy 
referral.  Each  chapter  contains,  at  its  end,  a  discussion  of  open  issues. 
1.1 Part I: Estimation with relative measurements

We examine the problem of estimating n variables $x_1, x_2, \dots, x_n$ from noisy
relative measurements of the form

\[
\zeta_{u,v} = x_u - x_v + \epsilon_{u,v}, \qquad u, v \in \{1, \dots, n\}, \tag{1.1}
\]

where $\epsilon_{u,v}$ is zero-mean measurement noise, when one or more of the
variables are assumed known. This problem arises in several sensor network
applications. The variables are often vector-valued.

A typical example is localization, in which locations of a number of nodes have
to be estimated from relative position measurements of the form (1.1). The need
for localization arises when the nodes cannot measure their positions directly, such
as when they are not equipped with Global Positioning System (GPS) capability.
However, certain pairs of nodes may be able to measure their relative positions.
For example, two nodes u and v that are located at positions $x_u$ and $x_v$ on a
plane may be able to measure their relative position $x_u - x_v$ in a common
Cartesian coordinate frame. The details of acquiring such measurements in
practice, using either on-board wireless devices or vision-based sensors, are
described in Section 2.1.1. Irrespective of the technique used, the relative
position measurement so obtained will have errors, and hence can be expressed in
the form of (1.1).

Apart  from  localization,  there  are  several  problems  relevant  to  sensor  and 
actuator  network  applications  where  variables  are  to  be  estimated  from  noisy 
relative  measurements  of  the  form  (1.1),  which  include  time-synchronization  and 
motion  consensus.  In  time-synchronization,  a  network  of  nodes  whose  local  clocks 
progress  at  varying  speeds  (skews)  and  have  different  offsets  from  one  another 
need to be synchronized. Two nodes that can communicate with each other can
obtain noisy measurements of the difference between their offsets and the ratio of their
skews.  The  skews  and  offsets  of  all  the  nodes  with  respect  to  a  common  reference 
need  to  be  estimated  from  the  noisy  relative  measurements. 

In  motion  consensus,  a  group  of  mobile  nodes  need  to  estimate  their  velocities 
or  directions  with  respect  to  a  leader,  but  each  node  can  only  measure  its  relative 
velocity  with  respect  to  a  few  nearby  neighbors.  In  Chapter  2  we  will  describe 
these  problems  in  detail. 

Each variable $x_u$ to be estimated is called a node variable, which is associated
with the node u. The known node variables are called reference variables. The
quantity $\zeta_{u,v}$ in (1.1) is called a relative measurement, or sometimes
simply a measurement. Node variables in general are vector-valued, and the
dimension of a node variable is denoted by k. For example, in the localization
problem k can be 2 or 3, depending on whether the nodes are located in a 2D or 3D
space. Note that although in general nodes are located in 3D space, sometimes the
third dimension may be irrelevant. This estimation problem can be naturally
associated with a measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. The
vertex set $\mathcal{V}$ of the measurement graph consists of the set of nodes
$\mathcal{V} := \{1, \dots, n\}$, where n is the number of nodes, while its edge
set $\mathcal{E}$ consists of all of the ordered pairs of nodes $(u, v)$ such that
a noisy measurement of the form (1.1) between u and v is available. The
measurement errors on distinct edges are assumed uncorrelated. In practice, none
of the variables may be known, in which case we arbitrarily assign one of the
nodes, say $o \in \mathcal{V}$, to be the reference node, and set $x_o = 0$. The
measurement graph $\mathcal{G}$ is a directed graph since $(u, v) \in \mathcal{E}$
implies the measurement $\zeta_{u,v}$ $(= x_u - x_v + \epsilon_{u,v})$ is available
while $(v, u) \in \mathcal{E}$ implies the measurement $\zeta_{v,u}$
$(= x_v - x_u + \epsilon_{v,u})$ is, and these two measurements are distinct.


1.1.1  Challenges  and  contributions 

The error $\epsilon_e$ affecting the measurement $\zeta_e$ (where
$e \in \mathcal{E}$) can be quite large depending on the application and sensing
technology. The goal is to obtain accurate estimates of all the node variables
from the noisy measurements. An estimate of a node variable $x_u$ can be obtained
by adding the measurements (after appropriately modifying their signs) along a
path from u to a reference node. For example, in Figure 1.1, consider the
undirected path $\mathcal{P}_1 := \{1, e_2, 2, e_4, 3, e_5, 4\}$ from the
reference node 1 to the node 4. By adding the measurements along the path, we
obtain an estimate of $x_4$:

\[
\begin{aligned}
\hat{x}_4 = -\zeta_2 - \zeta_4 + \zeta_5
  &= -(x_1 - x_2 + \epsilon_2) - (x_2 - x_3 + \epsilon_4) + (x_4 - x_3 + \epsilon_5) \\
  &= x_4 + (-\epsilon_2 - \epsilon_4 + \epsilon_5),
\end{aligned}
\]

where $x_1$ vanishes since the reference variable $x_1$ is assumed to be 0. Since
the measurement errors are assumed uncorrelated, the covariance of the estimation
error is the sum of the covariances of the measurement errors $\epsilon_2$,
$\epsilon_4$, and $\epsilon_5$. Denoting by $P_e$ the covariance of the
measurement error $\epsilon_e$, i.e., $P_e := E[\epsilon_e \epsilon_e^T]$, the
covariance of the error in the estimate $\hat{x}_4$ is

\[
E[(\hat{x}_4 - x_4)(\hat{x}_4 - x_4)^T] = P_2 + P_4 + P_5.
\]

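However, it is possible to construct another estimate of $x_4$, denoted here by
$\tilde{x}_4$, by using measurements along another path,
$\mathcal{P}_2 := \{1, e_1, 2, e_3, 4\}$, from the reference 1 to 4:

\[
\tilde{x}_4 = -\zeta_1 - \zeta_3
            = -(x_1 - x_2 + \epsilon_1) - (x_2 - x_4 + \epsilon_3)
            = x_4 - \epsilon_1 - \epsilon_3.
\]

The covariance of the error in the estimate $\tilde{x}_4$ is
$E[(\tilde{x}_4 - x_4)(\tilde{x}_4 - x_4)^T] = P_1 + P_3$.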
If all the measurement error covariances are equal, the error covariance of this
second estimate, which sums two measurement errors, is smaller than that of the
first estimate, which sums three. Still, both estimates above use only a subset
of the available measurements. It is possible to construct a more accurate
estimate by using all the available measurements.

Figure 1.1. Estimating node variables by adding relative measurements along a
path. Two different paths, $\mathcal{P}_1$ and $\mathcal{P}_2$, are shown, that go
from 1 to 4.

We  will  describe  in  Chapter  2  how  the  optimal  estimate  of  all  the  node  variables 
can  be  computed  by  using  all  the  measurements.  The  optimal  estimate  refers 
to  that  obtained  by  the  classical  best  linear  unbiased  estimator  (BLUE)  that  is 
guaranteed  to  produce  the  minimum  variance  estimate  among  all  linear  unbiased 
estimators [9]. When all the measurements, their associated error covariances, and
information about the measurement graph $\mathcal{G}$ are available at a single processor, it
can  compute  the  optimal  estimates.  Therefore  the  estimation  problem  described 
above  can  be  solved  by  first  sending  all  measurements  to  one  particular  node, 
computing  the  optimal  estimates  in  that  node,  and  then  distributing  the  estimates 
to  the  individual  nodes. 
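
The gain from using all the measurements can be checked numerically on a small
example. The following Python sketch assumes scalar node variables, unit error
variance on every edge, and an edge list read off the two paths discussed earlier
(e_1 and e_2 between nodes 1 and 2, e_3 between 2 and 4, e_4 between 2 and 3, and
e_5 between 4 and 3); Figure 1.1 may contain further edges. It uses the standard
weighted least-squares form of the BLUE error covariance, (H^T P^{-1} H)^{-1},
where H is a name introduced here for illustration: it stacks one row per
measurement, with the column of the reference variable removed.

```python
from fractions import Fraction

# Assumed edge list: edge k = (u, v) carries zeta_k = x_u - x_v + eps_k.
# Node 1 is the reference (x_1 = 0); nodes 2, 3, 4 are unknown.
EDGES = {1: (1, 2), 2: (1, 2), 3: (2, 4), 4: (2, 3), 5: (4, 3)}
UNKNOWNS = [2, 3, 4]


def information_matrix(edges, unknowns, variance=Fraction(1)):
    """Return H^T P^{-1} H for scalar variables with equal noise variance."""
    index = {node: i for i, node in enumerate(unknowns)}
    n = len(unknowns)
    info = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges.values():
        # Row of H: +1 in column u, -1 in column v; reference column dropped.
        row = [Fraction(0)] * n
        if u in index:
            row[index[u]] = Fraction(1)
        if v in index:
            row[index[v]] = Fraction(-1)
        for i in range(n):
            for j in range(n):
                info[i][j] += row[i] * row[j] / variance
    return info


def invert(matrix):
    """Invert a small nonsingular matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [a / scale for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


covariance = invert(information_matrix(EDGES, UNKNOWNS))
blue_var_x4 = covariance[UNKNOWNS.index(4)][UNKNOWNS.index(4)]
path1_var = Fraction(3)  # P_2 + P_4 + P_5 with unit variances
path2_var = Fraction(2)  # P_1 + P_3 with unit variances
print(blue_var_x4, path2_var, path1_var)  # prints: 7/6 2 3
```

Under these assumptions the optimal estimate of x_4 has error variance 7/6,
compared with 2 and 3 for the two path-based estimates.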

However,  this  centralized  solution  is  undesirable  for  several  reasons.  First, 
when  wireless  communication  is  used,  this  method  unduly  burdens  the  nodes 
close to the central processor. In a large ad-hoc network of wireless nodes,
sending all of the measurements requires multi-hop communication, and most of the
data transmitted to the central processor have to be routed through the nodes
close to it. When the nodes operate on batteries with small energy budgets, this
mode of operation greatly reduces the life of the nodes that carry out most of the
communication. It should be noted that the primary source of energy consumption
in wireless sensor networks is communication [10], while much less energy is
consumed for computation [11]. Second, centralized computation is less robust
to node and link failures over time. Multi-hop data transfer to a central node
typically requires the construction of a routing tree rooted at the central node.
Failure of a node in one of the branches of the routing tree effectively cuts off
communication from all of the nodes in the tree branch rooted at the faulty node.
In addition, construction of a routing tree can be challenging when communication
links suffer from temporary failures or when nodes are mobile [12]. Third, a
centralized  computation  renders  the  entire  network  susceptible  to  a  catastrophe  if 
the  central  processor  fails.  This  discussion  raises  a  key  issue  that  is  investigated 
in  this  dissertation: 

Question 1 (distributed estimation): Is it possible to construct
the optimal estimate in a distributed fashion such that the communication
and computation burden is shared equally by all the nodes? If so,
how  much  communication  is  required  between  nodes,  and  how  robust 
is  the  distributed  algorithm  with  respect  to  communication  failures? 

By  a  distributed  algorithm  we  mean  an  algorithm  in  which  every  node  carries 
out independent computations to estimate its own variable, but is allowed to
periodically exchange messages with a set of neighbors that are close enough to it so as
to  enable  communication.  We  show  that  it  is  indeed  possible  to  design  distributed 
algorithms  to  compute  optimal  estimates  that  are  robust  to  communication  faults. 
Our  contributions  in  this  regard  are  briefly  outlined  below: 

1. In Chapter 3 we develop and analyze two distributed algorithms - Jacobi and
OSE - for computing the optimal estimates when the measurement graph
does not change with time. These algorithms are iterative, and the estimates
produced by the algorithms converge to the optimal ones if inter-node
communication is symmetric, i.e., if a node u can receive messages from another
node v, then v can also receive messages from u. The convergence of the
algorithms is proved to be robust to the presence of temporary communication
failures. We relate the convergence rate of the Jacobi algorithm to
the spectral properties of the measurement graph, in particular, to the minimum
eigenvalue of a matrix that describes the structure of the measurement
graph.

2.  In  Chapter  3,  we  also  examine  the  effect  of  asymmetric  communication, 
which  refers  to  the  situation  where  a  node  u  can  receive  messages  from 
another  node  v,  but  v  cannot  receive  messages  from  u.  Such  asymmetry  is 
especially  common  in  ad-hoc  wireless  networks  on  account  of  inhomogeneous 
interference,  packet  collisions,  and  inaccurate  time  synchronization.  We 
show that in the presence of asymmetric communication, the Jacobi algorithm
still converges, but to a sub-optimal estimate.
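
The flavor of such an iterative scheme can be conveyed with a short Python
sketch. It is a Jacobi-type iteration under simplifying assumptions that are ours
rather than the dissertation's: scalar node variables, equal error variances on
all edges, synchronous rounds, a connected graph, and symmetric, failure-free
communication. In each round, every non-reference node replaces its estimate by
the average of the values implied by its neighbors' previous estimates and the
shared relative measurements. The function name, the data layout, and the example
graph are hypothetical; the algorithms analyzed in Chapter 3 may differ in detail.

```python
def jacobi_estimates(measurements, nodes, reference, iterations=200):
    """Jacobi-type iteration for scalar variables with equal noise variances.

    measurements: list of ((u, v), zeta) with zeta = x_u - x_v + noise.
    The reference variable is fixed at 0. Every node is assumed to have at
    least one incident measurement.
    """
    # For each node, collect (neighbor, offset) pairs: x_node ~ x_neighbor + offset.
    implied = {node: [] for node in nodes}
    for (u, v), zeta in measurements:
        implied[u].append((v, zeta))    # x_u ~ x_v + zeta
        implied[v].append((u, -zeta))   # x_v ~ x_u - zeta

    estimate = {node: 0.0 for node in nodes}
    for _ in range(iterations):
        # Synchronous round: all nodes use the estimates from the previous round.
        updated = dict(estimate)
        for node in nodes:
            if node == reference:
                continue
            terms = [estimate[nbr] + offset for nbr, offset in implied[node]]
            updated[node] = sum(terms) / len(terms)
        estimate = updated
    return estimate


# Hypothetical example: noise-free measurements of known values, so the
# iteration should recover the true node variables.
truth = {1: 0.0, 2: 1.0, 3: -2.0, 4: 3.5}
edges = [(1, 2), (1, 2), (2, 4), (2, 3), (4, 3)]
measurements = [((u, v), truth[u] - truth[v]) for u, v in edges]
estimate = jacobi_estimates(measurements, nodes=[1, 2, 3, 4], reference=1)
print({node: round(value, 6) for node, value in estimate.items()})
```

Each update uses only a node's own measurements and its neighbors' latest
estimates, which is what makes the computation distributed. With noisy
measurements and equal variances, the fixed point of this iteration is the
least-squares solution.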

The  convergence  rate  of  the  algorithms  proposed  here  can  be  slow  in  large 
graphs. In addition, the distributed algorithms we have proposed are not
applicable when the measurement graph changes with time, e.g., when positions of
mobile  nodes  are  to  be  estimated  and  node  variables  are  dynamically  evolving  in 
time.  Therefore,  in  certain  situations,  a  centralized  computation,  or  perhaps  a 
combination  of  distributed  and  centralized  computation,  may  still  be  required. 

Irrespective of which estimation algorithm is used to estimate the node
variables, no linear unbiased estimator can obtain an estimate that is more accurate
than that of the BLUE. This offers a compelling reason to study the error in
the BLUE, since the accuracy of the BLUE provides a fundamental limit to the
accuracy achievable by any linear unbiased estimation algorithm.

Sensor  networks  consisting  of  more  than  a  thousand  nodes  have  already  been 
developed  [13].  Furthermore,  it  is  envisioned  that  sensor  networks  consisting 
of  tens  of  thousands,  if  not  millions,  of  nodes  are  going  to  be  deployed  in  the 
near  future  [10].  Therefore,  understanding  limits  of  performance  in  large  graphs, 
particularly  limits  that  are  algorithm-independent,  is  important  if  such  networks 
are  to  be  deployed  successfully. 

For  these  reasons,  we  study  large  graphs  with  a  single  reference  node  and 
examine  how  the  error  covariance  of  a  node  variable’s  estimate  changes  as  the 
node’s  distance  from  the  reference  increases.  In  general,  one  expects  the  error  to 
grow  with  distance.  The  growth  rate  determines  the  size  of  the  graph  that  can 
be  “serviced”  by  a  single  reference  node.  For  a  given  acceptable  estimation  error, 
if  the  error  growth  rate  is  high,  either  the  graph  has  to  be  kept  small,  or  more 
reference nodes have to be introduced.

We  show  that  the  structure  of  the  measurement  graph  determines,  to  a  great 
extent,  how  the  estimation  error  of  a  node  will  vary  as  a  function  of  its  distance 
from  the  reference  node.  Evidence  in  support  of  this  statement  is  presented  in 
Figure  1.2,  which  shows  two  graphs  and  the  optimal  estimation  error  variances  of 
the node variables in each. Both the graphs $\mathcal{G}_a$ and $\mathcal{G}_b$ are
obtained by placing nodes randomly in the plane and allowing two nodes to have an
edge between them if and only if their distance is less than a certain value. In
case of $\mathcal{G}_a$, nodes are allowed to fall only within the boundary shown
in dashed lines. One can think of $\mathcal{G}_a$ as a sensor network obtained by
placing nodes randomly in an urban terrain, whereas $\mathcal{G}_b$ is obtained by
placing nodes in a large, level area. Both graphs have one reference node, placed
at (0,0), the same average node degree of 3.2, the same node density of 500 nodes
per unit area of the deployed region, and the same measurement error variance on
each edge. The degree of a node is the number of edges incident on it; therefore,
the same average node degree implies the same average number of measurements per
variable. For simplicity, we considered the case of
scalar  node  variables.  Computation  of  the  optimal  estimation  error  variances  is 
described  in  Chapter  2.  From  the  plot  of  the  estimation  error  variances,  we  see 
that  the  graphs  have  quite  distinct  estimation  error  growth  rates. 

This  example  motivates  the  second  issue  investigated  in  this  dissertation: 

Question 2 (error scaling): Can different graphs exhibit vastly different
error scaling laws, i.e., the rate at which the optimal estimation
error covariance of a node variable grows as a function of the node’s
distance from the reference? If so, what structural properties of measurement
graphs determine these scaling laws, and how do we identify
these properties as well as the scaling laws?
Figure 1.2. Two measurement graphs with very different error scaling laws. The
two graphs, $\mathcal{G}_a$ and $\mathcal{G}_b$, are obtained by placing nodes
randomly in the plane and allowing two nodes to have an edge between them if their
distance is less than a certain value. In case of $\mathcal{G}_a$, nodes are
allowed to fall only within the boundary shown as dashed lines. Both graphs have
one reference node, placed at (0, 0). Both graphs have the same average node
degree, namely 3.2, the same node density, namely 500 nodes/unit area, and the
same measurement error variance, namely 1, on each edge. The bottom plot shows the
trend of the optimal estimation error variances in the two graphs as a function of
the Euclidean distance $d(u, o)$ between the nodes u and o in the plane. The
legend A refers to the graph $\mathcal{G}_a$, and B refers to the graph
$\mathcal{G}_b$.
The question of error scaling is important to study for a number of reasons.
Given a maximal acceptable error, the number of nodes whose estimation errors
are lower than this level is large if the graph exhibits a slow increase of variance
with distance, but small otherwise. These scaling laws therefore help one design
and deploy large networks for which accurate estimation is possible. In addition,
knowledge of the scaling laws for the optimal estimation error provides an
algorithm-independent limit on the lowest possible error growth, since the optimal
estimator has the lowest estimation error variance among all linear unbiased
estimators.

We  show  that  estimation  error  can  indeed  exhibit  vastly  different  scaling  laws 
depending  on  certain  structural  properties  of  the  graph.  Our  contributions  in  this 
regard  are  briefly  outlined  below: 

1. In Chapter 4, we examine infinite measurement graphs, in which the numbers
of variables and measurements are countably infinite. We show that under
certain conditions, the estimation error covariance of a node variable in a
large finite graph is close to that in an infinite graph. Intuitively, for a
node that resides sufficiently inside a large finite graph, i.e., not close to
the boundary, the graph appears to extend to infinity in all directions. The
results in Chapter 4 provide formal justification for using infinite graphs
as proxies for large finite graphs and also establish the conditions under
which such an approximation is valid. The advantage of working with infinite
graphs is that boundary conditions in infinite graphs are weaker than in finite
graphs, which makes them easier to analyze.

2.  As  a  first  step  toward  answering  the  error-scaling  question,  we  prove  in 
Chapter  4  that  the  covariance  of  a  node’s  optimal  estimation  error  is  equal 
to  the  matrix-valued  effective  resistance  in  an  abstract  electrical  network 
that can be constructed from the measurement graph and measurement
error covariances.

3. In Chapter 5, we obtain a classification of graphs, namely, dense or sparse
in d dimensions, $1 \le d \le 3$, that determines the error scaling laws. In
particular, if a graph is dense in 1D, 2D, or 3D, then a node variable’s
estimation error is upper bounded by a linear function of its distance from the
reference, by a logarithmic function of that distance, or by a constant,
respectively. Corresponding lower bounds are obtained if the graph
is sparse in 1D, 2D, or 3D. The electrical analogy is instrumental in obtaining
the error scaling laws. The sparse graphs are simply the “graphs that can
be drawn in a civilized manner” that were originally introduced by Doyle
and Snell [2].

That the error grows with distance without bound in many graphs is perhaps
not surprising. What is perhaps surprising is that there are graphs in which
the covariance remains below a constant value regardless of the distance.
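
The contrast between linear and slower-than-linear error growth can be
illustrated with the electrical analogy. The Python sketch below assumes scalar
node variables and unit error variance on every edge, so that the optimal error
variance of a node equals the effective resistance between it and the reference
in a network of unit resistors. It compares a path graph (a 1D lattice) with a
small square grid (a finite piece of the 2D lattice). The function names and the
grid size are illustrative choices, and a grid this small only shows that the
growth is slower than linear; it does not by itself exhibit the logarithmic law.

```python
def effective_resistance(edges, nodes, reference, target):
    """Effective resistance between reference and target with unit resistors.

    Solves the grounded Laplacian system L y = e_target by Gaussian
    elimination; the resistance is the entry of y at the target node.
    """
    unknowns = [n for n in nodes if n != reference]
    index = {n: i for i, n in enumerate(unknowns)}
    size = len(unknowns)
    lap = [[0.0] * size for _ in range(size)]
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a in index:
                lap[index[a]][index[a]] += 1.0
                if b in index:
                    lap[index[a]][index[b]] -= 1.0
    rhs = [0.0] * size
    rhs[index[target]] = 1.0
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(lap[r][col]))
        lap[col], lap[pivot] = lap[pivot], lap[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, size):
            factor = lap[r][col] / lap[col][col]
            if factor != 0.0:
                for c in range(col, size):
                    lap[r][c] -= factor * lap[col][c]
                rhs[r] -= factor * rhs[col]
    # Back substitution.
    sol = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(lap[r][c] * sol[c] for c in range(r + 1, size))
        sol[r] = (rhs[r] - tail) / lap[r][r]
    return sol[index[target]]


def path_graph(length):
    """Nodes 0..length joined in a line."""
    return list(range(length + 1)), [(k, k + 1) for k in range(length)]


def square_grid(half_width):
    """Grid of nodes (i, j), |i|, |j| <= half_width, with nearest-neighbor edges."""
    span = range(-half_width, half_width + 1)
    nodes = [(i, j) for i in span for j in span]
    edges = []
    for i, j in nodes:
        if i < half_width:
            edges.append(((i, j), (i + 1, j)))
        if j < half_width:
            edges.append(((i, j), (i, j + 1)))
    return nodes, edges


M = 4  # largest distance from the reference that is examined
path_nodes, path_edges = path_graph(M)
grid_nodes, grid_edges = square_grid(M)
path_r = [effective_resistance(path_edges, path_nodes, 0, d)
          for d in range(1, M + 1)]
grid_r = [effective_resistance(grid_edges, grid_nodes, (0, 0), (d, 0))
          for d in range(1, M + 1)]
for d, (rp, rg) in enumerate(zip(path_r, grid_r), start=1):
    print(f"distance {d}: path {rp:.3f}, grid {rg:.3f}")
```

On the path the variance equals the distance exactly, while on the grid the many
parallel routes between a node and the reference keep it well below that value.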

Analogies  with  electrical  networks  are  used  in  [2,  14]  to  construct  elegant 
solutions  to  various  graph  problems,  notably  those  concerned  with  random  walks. 
In  [2],  questions  about  random  walks  in  certain  infinite  graphs  are  answered  by 
bounding  the  effective  resistance  in  those  graphs  with  that  in  lattices.  It  turns  out 
that  a  similar  approach  can  be  used  to  answer  the  question  of  error  scaling,  once 
we  establish  the  analogy  between  error  covariance  matrices  and  matrix-valued 
effective  resistances. 

We note that the scaling laws of the estimation error are not captured by
naive measures of density such as node degree or node and edge density, which
are commonly used in the sensor networks literature [15-17]. Using the dense and
sparse classification obtained in this dissertation, we provide counterexamples that
expose certain misconceptions that exist in the sensor network literature about
the relationship between graph structure and estimation error. These
counterexamples show that graphs with the same node degree can exhibit vastly
different error scaling laws.


1.2 Part II: Decentralized control with relative measurements

In  a  number  of  applications,  teams  of  mobile  autonomous  agents  are  required 
to  perform  tasks  in  a  collaborative  manner.  For  example,  consider  a  team  of 
autonomous  mobile  nodes  (UAVs,  ground  robots,  underwater  vehicles  etc.)  that 
are  required  to  maintain  a  particular  formation  while  in  motion.  The  formation  is 
specified  in  terms  of  desired  relative  positions  between  every  pair  of  nodes.  Each 
node  is  allowed  to  communicate  and  measure  its  relative  position  with  only  a 
small  subset  of  all  the  nodes.  Each  node’s  task  is  to  take  control  actions  using 
locally  available  measurements,  such  that  the  group  still  attains  its  collective  goal 
of  maintaining  the  desired  formation. 

Motivation for studying formation control problems arises from their relevance
in a wide spectrum of problems, from military surveillance to swarming in nature.
Maintaining a formation while in motion can reduce aerodynamic drag in
aircraft [18, 19] and allegedly in birds and spiny lobsters as well [20-22], increase
traffic capacity in highways [6], ensure full coverage of the sensed field in spite
of limited sensing capability of individual nodes [23], and make it possible to
build extra-terrestrial interferometric imaging systems composed of multiple
satellites [24].

In all these situations, whether man-made or natural, it is reasonable to assume
that the individual nodes have access to only relative position or velocity
measurements. The problem of formation control using only locally available
measurements falls under the broader category of decentralized coordination problems
[25]. In such problems a group of nodes has to achieve a common objective
without the help of a central authority, while the nodes have access to limited, local
information. In this dissertation we use the term decentralized control architecture
to refer to the architectural constraint that each node in an interconnected system
is allowed to use, as input to its local control algorithm, information that is
available to it or that it collects by communicating with a few nearby neighbors.
This is in contrast to a centralized architecture, in which information gathered by
every node is made available to a central controller that computes appropriate
control actions for all the nodes. In an interconnected system of many constituent
nodes that are spatially separated, a decentralized architecture is preferable to
a centralized one since the latter suffers from large communication overhead. In
large interconnected systems in particular, such overhead may make a centralized
architecture well-nigh impossible, making decentralized architectures the only
possibility.

1.2.1  Contributions 

In  problems  of  decentralized  coordination,  including  the  specific  problem  of 
formation  control,  the  task  of  the  design  engineer  is  to  develop  control  algorithms 
for  all  the  nodes  of  a  sensor-actuator  network,  so  that  every  node  implements  a 
local  control  law  that  uses  only  locally  available  information,  while  the  network 
still  attains  its  collective  goal.  In  a  general  interconnected  system,  this  leads  to  a 
high  degree  of  complexity.  For  example,  if  there  are  n  agents  that  make  up  the 
interconnected system, in principle n separate control algorithms can be designed,
one for each agent. However, tools for designing controllers for an interconnected
system that incorporate the constraint of decentralized architecture explicitly
are not well developed (with a few notable exceptions such as [26-28]). As a
result, the designer is often forced to resort to a somewhat ad hoc design, based
on trial and error. Frequently, it is arbitrarily decided that every agent will use
the same control algorithm, so that the problem is reduced to that of designing
a single control algorithm. In fact, a large number of the control laws examined in
the literature on sensor-actuator networks fall into this category [29, 30]. Even
with such simplification, performance analysis of the closed loop system is not
straightforward due to the lack of appropriate analysis tools. In summary, a clear
understanding of the effect of interconnection structure on performance measures such as
stability and robustness to measurement noise is lacking.

Motivated by these issues, we have examined the problem of decentralized formation
control from relative measurements. We outline below the specific aspects
examined  and  our  contributions: 

1. In Chapter 6, we study the formation control problem with relative measurements.
We quantify the effect of interconnection topology on the noise sensitivity
of the closed loop formation. To focus on the effect of interconnection,
simple forms of node dynamics and control laws are assumed. We
examine a control law that uses only relative measurements, which has been
extensively used in the literature.

For this formation control problem, we show that the covariance of the
steady state error (on account of measurement noise) is equal to a matrix-valued
effective resistance in an abstract electrical network that can be constructed
from the formation graph. This effective resistance was introduced
earlier in Chapter 4, and the effect of graph structure on effective resistance
was studied in Chapter 5. Using the results from those chapters, we show
that the performance of the algorithm in the presence of noise is quite sensitive
to interconnection topology, and show in which graphs formation control
with small errors is possible and in which graphs it is not. The
analogy with effective resistance is used to explain certain observations in
animal formations.

2. The stability margin of the formation is shown to depend on the least stable
eigenvalue of the Dirichlet Laplacian matrix of the interconnection graph.
This matrix, originally encountered in the estimation problem studied in
Part I, arises naturally in control and estimation problems of relevance to
interconnected systems. The minimum eigenvalue of the Dirichlet Laplacian
matrix also determines the rate of convergence of many algorithms used in
control and estimation problems (in both continuous and discrete-time settings).
This eigenvalue essentially captures the rate of information propagation
through the graph, which makes it a key quantity in the convergence
rate analysis of such algorithms.

In Chapter 6 we obtain a bound on this eigenvalue in terms of the matrix-valued
effective resistance introduced earlier in Chapter 4. It turns out that
effective  resistances  provide  a  non-trivial  lower  bound  on  this  eigenvalue,  for 
which  few  tools  are  otherwise  available. 

3. In Chapter 7, we examine the problem of decentralized control of vehicular
platoons, in which the control objective is to maintain a constant inter-vehicular
separation. For this problem, we allow more complex controllers
and dynamics, but study a specific interconnection topology. Interest in the
control of platoons has a long history, dating back at least half a century
(see the 1958 paper [31]).

The  particular  architecture  we  study  in  this  chapter  is  known  as  symmetric 
bidirectional  control,  in  which  every  vehicle  uses  measured  relative  positions 
with  its  two  neighboring  vehicles,  and  every  vehicle  uses  the  same  controller 
that  is  furthermore  symmetric  with  respect  to  front  and  back.  That  is, 
the  spacing  error  with  respect  to  the  vehicle  in  front  has  equal  importance 
to  that  with  respect  to  the  vehicle  behind.  The  noise  sensitivity  of  this 
architecture was investigated earlier in [32]. This chapter answers a few
questions left unanswered in [32].

We show that the stability and noise sensitivity of this architecture depend
on the number of integrators in the loop transfer function of the plant (vehicle
dynamics). If there are more than two integrators, or if the vehicle
dynamics are non-minimum phase, the closed loop will become unstable for a
sufficiently large number of vehicles, no matter how the controller is designed.
If there is a single integrator, the steady state spacing errors for a
constant velocity reference will grow without bound as the number of vehicles
increases, but for two integrators, the steady state errors will go to 0
for an arbitrary number of vehicles. When there are no integrators in the
controller, the symmetric bidirectional architecture suffers from the “slinky
effect”, namely, the measurement noise and disturbances acting on the vehicles
will be amplified without bound as the number of vehicles increases.
This amplification occurs whether the vehicle dynamic model has one
or two integrators.

4. In order to ameliorate some of the limitations of the symmetric bidirectional
architecture, we propose a methodology to design separate controllers for every
vehicle. To handle the complexity of this design problem, which involves
designing 2N separate controllers (where N is the number of vehicles), we
use an alternate modeling framework.

We first derive from first principles a partial differential equation (PDE)
based model of the platoon dynamics, by taking a continuum approximation
of the platoon when the number of vehicles is large. This approach
is motivated by the extensive literature that exists on PDE modeling of traffic
flow. We design the control gains by a mistuning-based method, whereby
the gains from the nominal, symmetric design are altered by small amounts.
This mistuning design nevertheless achieves an order of magnitude improvement
compared to the symmetric design. In particular, we show that the
least stable eigenvalue of the closed loop platoon dynamics decays to 0 as
$O(1/N^2)$ in the symmetric bidirectional case, where N is the number of vehicles.
However, the decay is only $O(1/N)$ with the mistuning design, even
with an arbitrarily small amount of mistuning. For large N, this results in
an order of magnitude improvement in the stability margin. The benefits are
seen to be significant even for small values of N. The predictions from the
PDE analysis are corroborated by numerical calculations on the state-space
representation of the platoon dynamics.
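
The role of the smallest Dirichlet Laplacian eigenvalue mentioned in item 2 can be illustrated numerically. The following is a minimal sketch under stated assumptions: an unweighted path graph whose first node is the single reference node, with scalar (not matrix-valued) weights. It is not the platoon model of Chapter 7, and the function names are hypothetical. It only shows how quickly this eigenvalue shrinks as a chain-like graph grows.

```python
import numpy as np

def dirichlet_laplacian_path(n):
    """Dirichlet Laplacian of an unweighted path with n non-reference nodes.

    Assumption: node 0 is the reference. Deleting its row and column from the
    Laplacian of the (n+1)-node path leaves an n x n tridiagonal matrix.
    """
    L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[-1, -1] = 1.0  # the far end node has only one neighbor
    return L

def min_eigenvalue(n):
    """Smallest eigenvalue of the Dirichlet Laplacian of the path."""
    return float(np.linalg.eigvalsh(dirichlet_laplacian_path(n))[0])

# The product lambda_min * n^2 stays roughly constant, i.e. the
# eigenvalue decays like 1/n^2 on this graph.
for n in (10, 20, 40, 80):
    lam = min_eigenvalue(n)
    print(n, lam, lam * n**2)
```

For this particular matrix the smallest eigenvalue has the closed form $4\sin^2\!\big(\pi/(2(2n+1))\big)$, so the printed product approaches $\pi^2/4$ as n grows.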

In  summary  of  both  estimation  and  control  with  relative  measurements,  we 
note  that  the  concept  of  matrix-valued  effective  resistance  for  weighted  graphs 
introduced  in  this  dissertation  is  seen  to  be  useful  in  both  classes  of  problems. 
The  effective  resistance  was  shown  to  characterize  the  estimation  error  of  the 
optimal  estimates  in  the  first  part  of  this  dissertation,  and  the  above  discussion 
shows  that  it  is  also  relevant  in  the  study  of  stability  margin  and  noise-sensitivity 
of formation control algorithms.

1.3 Notation


A list of the notation used throughout this dissertation is provided below for
easy referral. Specific notation is introduced where it is first used.

Table 1.1. Notation

Symbol                                   Meaning
$\mathbb{R}, \mathbb{N}, \mathbb{C}$     set of real, natural and complex numbers, respectively
$I_k$                                    identity matrix in $\mathbb{R}^{k \times k}$
$\mathbb{S}^{k+}$                        $k \times k$ symmetric positive definite matrices
$\mathcal{H}$                            real Hilbert space
$X \geq 0$                               $X$ is a positive semi-definite matrix
$X \succeq 0$                            $X$ is entry-wise non-negative
$\mathcal{G}$                            graph
$\mathcal{V}$, $n$ or $N$                node set, number of nodes
$\mathcal{V}_r$, $n_r$                   set of reference nodes, number of reference nodes
$\mathcal{E}$                            edge set
$\mathcal{G}^{(h)}$, $\mathbb{Z}_d$      $h$-fuzz of $\mathcal{G}$, $d$-dimensional square lattice
$e \sim u$                               edge $e$ is either $(u, v)$ or $(v, u)$ for some $v$
$A$, $A_b$                               incidence matrix, basis incidence matrix
$\mathcal{A}$                            generalized incidence matrix, $A \otimes I_k$
$\mathcal{L}$                            (generalized, or matrix-weighted) graph Laplacian
$\mathcal{D}$, $\mathcal{N}$             (matrix-weighted) degree matrix, adjacency matrix
L                                        (matrix-weighted) Dirichlet Laplacian matrix
M, N                                     (matrix-weighted) basis degree matrix, basis adjacency matrix
L, M, N                                  scalar weighted versions of L, M, N
$P_e$                                    covariance of the measurement error on edge $e \in \mathcal{E}$
$R_e$                                    matrix resistance on edge $e$
$R^{\text{eff}}_{u,v}$                   generalized effective resistance between $u$ and $v$
$\Sigma_{u,o}$                           error covariance of $x_u$'s BLUE estimate, with $o$ as the reference
$\mathbf{j}$, $j$                        flow, flow intensity
$\mathbf{i}$, $i$                        current, current intensity

Part I


Estimation with Relative
Measurements

Chapter 2


Estimation  with  relative 
measurements:  applications  and 
the  optimal  estimator 


In this chapter we formally describe the problem of estimation from relative
measurements. First, in Section 2.1, we describe the applications - which
mostly  come  from  sensor  and  actuator  networks  -  where  this  estimation  problem 
is  relevant.  These  include  sensor  localization,  camera  network  calibration,  time 
synchronization,  motion  coordination,  and  mobile  robot  localization.  We  discuss 
the  importance  of  these  problems,  and  describe  in  detail,  in  each  case,  how  the 
relative  measurements  are  obtained  in  practice.  Then  Section  2.2  formally  defines 
the  optimal  estimation  problem  in  terms  of  a  graph,  and  presents  the  solution  to 
the optimal estimation problem.

2.1 Applications


2.1.1  Sensor  network  localization 

Consider  a  network  of  sensor  nodes  that  are  deployed  in  a  large  geographical 
area.  Nodes  in  a  sensor  network  are  often  not  equipped  with  GPS,  since  they 
are  required  to  be  small,  cheap,  and  are  expected  to  operate  for  a  long  time  with 
a  battery  of  limited  life  [10].  On  the  other  hand,  in  almost  all  potential  and 
realized  applications  of  sensor  networks  that  impose  these  constraints,  such  as 
habitat  monitoring  [33],  forest  fire  detection  [34],  hazardous  area  and  perimeter 
surveillance,  structural  health  monitoring  [35],  military  reconnaissance  and  target 
tracking  [36],  knowledge  of  the  nodes’  locations  is  critical  for  the  user  of  the 
network.  The  localization  problem  consists  of  estimating  node  locations  from 
measurements  that  the  sensors  can  provide.  Although  a  sensor  does  not  know  its 
position  in  a  global  coordinate  system,  it  can  usually  measure  its  position  relative 
to  a  set  of  nearby  nodes.  These  measurements  can  be  obtained  in  a  number  of 
ways  that  depend  on  the  sensing  technology  available  and  the  application  domain, 
which  are  described  below. 

When  the  sensor  nodes  in  question  are  equipped  with  wireless  devices,  range 
measurements  can  be  obtained  by  a  variety  of  techniques,  such  as  received  signal 
strength  [37]  and  time  of  arrival  [38]  measurements.  In  certain  scenarios,  they 
can be fitted with acoustic ranging devices [39]. Angle measurements with small
form-factor devices are more challenging, though possible - albeit with limited
accuracy - with switched microstrip antenna arrays [40]. Assuming that each
node has a local compass to measure bearing with respect to a common North,
noisy measurements of the range $r_{u,v}$ and bearing $\theta_{u,v}$ between a pair of sensors
u and v are converted to noisy measurements of relative position in the x-y plane
as

$$\zeta_{u,v} = \begin{bmatrix} r_{u,v} \cos\theta_{u,v} \\ r_{u,v} \sin\theta_{u,v} \end{bmatrix}.$$


The  same  procedure  is  performed  for  every  pair  of  sensors  that  can  measure  their 
relative  range  and  bearing.  Since  the  range  and  bearing  measurements  have  errors, 
the relative position $\zeta_{u,v}$ measured in Cartesian coordinates also has error, which
can be approximated as an ellipsoidal region characterized by a covariance matrix
(see Figure 2.1 for a schematic). Measurement errors between distinct pairs of
nodes  can  be  assumed  uncorrelated. 
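
The range-and-bearing conversion above, together with an ellipsoidal error approximation, can be sketched as follows. This is only an illustration under stated assumptions: the range and bearing errors are taken to be independent with standard deviations `sigma_r` and `sigma_theta` (hypothetical parameters), and the covariance is obtained by a first-order (Jacobian) linearization, which is one common way to get such an approximation and not necessarily the procedure used later in this dissertation.

```python
import numpy as np

def relative_position(r, theta):
    """Convert range r and bearing theta (radians) to a relative position."""
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def relative_position_covariance(r, theta, sigma_r, sigma_theta):
    """First-order approximation of the covariance of the relative position.

    Assumes independent range and bearing errors (an assumption of this
    sketch) and propagates them through the Jacobian of the conversion.
    """
    # Jacobian of (r cos(theta), r sin(theta)) with respect to (r, theta)
    J = np.array([[np.cos(theta), -r * np.sin(theta)],
                  [np.sin(theta),  r * np.cos(theta)]])
    S = np.diag([sigma_r**2, sigma_theta**2])
    return J @ S @ J.T

zeta = relative_position(10.0, 0.0)
P = relative_position_covariance(10.0, 0.0, sigma_r=0.1, sigma_theta=0.01)
print(zeta)
print(P)
```

Note that the bearing error contributes a position error that grows with the range r, so distant pairs of nodes yield elongated uncertainty ellipses.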


When the sensor nodes have on-board cameras, it is again possible to measure
relative positions between pairs of nodes whose cameras have an opportunity
to view a common object in an overlapping field of view (see Figure 2.2 for a
schematic). Measuring the relative position and orientation between two cameras
involves collaborative information gathering and processing, and is referred to as
camera network calibration. The reader is referred to [42] and references therein
for the details of obtaining such measurements. Typically, when two cameras u
and v take part in calibration and exchange their local calibration parameters,
one of the cameras, say u, estimates the relative position of v w.r.t. itself in its
local coordinate frame, denoted by $p_{v,u}$ (which is either a 2-vector or a 3-vector).
Assuming that the rotation matrix $T_{uo}$ that specifies the rotation from u's local
coordinates to the common global coordinate frame attached to the reference node
o is available to u, it can estimate the position of node v w.r.t. itself (i.e., $x_v - x_u$)
in a common Cartesian reference frame as

$$\zeta_{v,u} = T_{uo}\, p_{v,u}.$$

As long as the errors in the estimated quantities $T_{uo}$ and $p_{v,u}$ are additive and
zero mean, the error in this relative position measurement is also additive and
zero mean. The rotation matrix $T_{uo}$ can be estimated as the product of the
rotation matrices in a path from o to u in the “graph” that describes relative
calibrations. For example, if (u,v), (v,w), and (w,o) are pairs of nodes such
that their calibration parameters are known, then an estimate of $T_{uo}$ is the
product of the three rotation matrices associated with these pairs.

Figure 2.1. Relative position measurement between pairs of nodes in a Cartesian
reference frame using range and bearing measurements. Noisy measurements of
range and bearing can be converted to noisy measurements of relative position
as $\zeta_{u,v} = [r_{u,v}\cos\theta_{u,v},\ r_{u,v}\sin\theta_{u,v}]^T$. The errors in range and bearing result in an
error in relative position. Although this error in general leads to a non-convex
uncertainty region, it can be approximated as an ellipsoidal one (shown as the
patterned region), which is characterized by a measurement error covariance matrix.
An example of how the measurement noise covariance can be estimated can
be found in Section 3.3.3.

In both the situations described above, two nearby sensors u and v located at
positions $p_u$ and $p_v$, respectively, have access to the measurement

$$\zeta_{u,v} = p_u - p_v + \epsilon_{u,v} \in \mathbb{R}^k,$$

where $\epsilon_{u,v}$ denotes measurement error. The dimension of the positions and relative
measurements, k, can be either 2 or 3, or even 1 in special cases. The problem of
interest is to use the $\zeta_{u,v}$'s to estimate the positions of all the nodes in a common
coordinate system whose origin is fixed arbitrarily at one of the nodes.

Figure 2.2. Relative position measurement between two cameras in a common
Cartesian reference frame. The cameras have an overlapping field of view and
have the opportunity to view an object of known size and shape. Each camera
then estimates its orientation with respect to the object and its distance from the
object [41]. By exchanging this information, the two cameras can estimate their
relative position and orientation. See [42] for more details.
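
To make the localization problem concrete, the sketch below estimates node positions from relative position measurements by ordinary least squares, with one node fixed at the origin. This is an illustration only, under the assumption that all measurement errors have the same covariance, so that no weighting is needed; the optimal estimator for general covariances is the subject of Section 2.2. The function name and the small example graph are hypothetical.

```python
import numpy as np

def estimate_positions(n_nodes, edges, measurements, ref=0):
    """Least-squares positions from relative measurements.

    edges[i] = (u, v) and measurements[i] is a noisy value of p_u - p_v.
    The reference node `ref` is placed at the origin. Unweighted least
    squares is used, which assumes identical error covariances.
    """
    Z = np.asarray(measurements, dtype=float)
    m, k = Z.shape
    # One row per measurement: +1 at node u, -1 at node v
    A = np.zeros((m, n_nodes))
    for i, (u, v) in enumerate(edges):
        A[i, u] = 1.0
        A[i, v] = -1.0
    keep = [j for j in range(n_nodes) if j != ref]
    X, *_ = np.linalg.lstsq(A[:, keep], Z, rcond=None)
    P = np.zeros((n_nodes, k))
    P[keep] = X
    return P

# Small example: four nodes in the plane, five noisy relative measurements.
rng = np.random.default_rng(0)
true_p = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [4.0, 4.0]])
edges = [(1, 0), (2, 1), (3, 2), (2, 0), (3, 1)]
zeta = [true_p[u] - true_p[v] + 0.01 * rng.standard_normal(2) for u, v in edges]
print(estimate_positions(4, edges, zeta))
```

The graph of measurements must be connected for the positions to be determined; otherwise the reduced matrix loses rank and some positions are not identifiable.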

We  do  not  consider  the  problem  of  localization  from  range  measurements  alone 
(or  angle  measurements  alone),  on  which  an  extensive  literature  exists  [39,  43-50]. 
When only range measurements are available, the relationship between measurements
and variables is non-linear. The difficulty of this non-linear problem is
well recognized, especially in the presence of noise [47]. Recognizing this difficulty,
localization with both range and angle measurements has been examined
recently [51, 52].

2.1.2 Time synchronization


Consider  a  set  of  nodes  forming  a  multi-hop  communication  network,  where 
each  node  has  a  local  clock.  Clocks  have  two  sources  of  inaccuracy  in  practice: 
skew  and  offset.  Skew  refers  to  the  rate  at  which  clocks  measure  time  and  offset 
refers  to  the  difference  between  the  local  times  of  two  clocks  that  have  the  same 
skew. At a particular “global” time t, the measured local time at a clock can be
modeled by $\alpha + \beta t$, where $\alpha$ is the offset w.r.t. the global time and $\beta$ is the skew.
Time  synchronization  consists  of  estimating  the  skews  and  offsets  of  all  the  nodes 
with  respect  to  a  common  reference  so  that  every  node  can  read  off  the  global 
time  from  its  local  clock. 

Time synchronization in sensor and actuator networks is important for a number
of reasons. First, sensor nodes need to coordinate and collaborate to achieve
their sensing tasks. In target tracking, for example, the location of a target and its
trajectory are estimated from the reports by the sensor nodes on when they sensed
the target [53]. Second, to operate for a long time with limited battery power,
sensor nodes typically turn off power consuming components for long periods and
wake up at predetermined times [54]. Such sleep-scheduling requires precise
timing between nodes. Third, communication protocols such as Time Division Multiple
Access (TDMA) require time synchronization for scheduling communication
between wireless devices. Fourth, feedback control in a sensor-actuator network
requires knowledge of a common time [55].

The  relative  skew  and  offset  between  a  pair  of  nodes  in  a  network  can  be 
measured (up to some error) by several methods, which are described below.

Case A: offset without skew: Consider first the case when all clocks have the
same skew but have different offsets. All skews can be assumed to be 1 without loss
of generality. Suppose that nodes u and v in Figure 2.3 can communicate directly
with each other and have clock offsets $t_u$ and $t_v$ with respect to a reference clock.
Node u transmits a message, say, at global time t, when the transmitter u's local time
is $\tau_{tu} = t + t_u$. The receiver v receives this message at a later time, when its
local clock reads $\tau_{rv} = t + t_v + \delta_{uv}$, where $\delta_{uv}$ is the random transmission delay
from u to v. The transmission delay in fact arises from several hardware-level
issues at both the transmitter and the receiver, such as the randomness in the
processor load, delay in accessing the medium, and length of the messages. An
extensive and excellent description of these issues can be found in [56]. Some time
later, say at $t'$ (global time), node v sends a message back to u, when its local
time is $\tau'_{tv} = t' + t_v$. This message includes the values $\tau_{rv}$ and $\tau'_{tv}$ in the message
body. Receiver u receives this message at local time $\tau'_{ru} = t' + t_u + \delta_{vu}$, where
the delay $\delta_{vu}$ has the same mean as the delay $\delta_{uv}$. Node u can now estimate the
clock offset as

$$\zeta_{u,v} = \frac{1}{2}\left[(\tau'_{ru} - \tau'_{tv}) - (\tau_{rv} - \tau_{tu})\right] = t_u - t_v + \frac{\delta_{vu} - \delta_{uv}}{2}.$$

The error $\epsilon_{u,v} := (\delta_{vu} - \delta_{uv})/2$ has zero mean as long as the delays $\delta_{uv}$ and $\delta_{vu}$ have
the same expected value. The measured clock offset between u and v is now

$$\zeta_{u,v} = t_u - t_v + \epsilon_{u,v} \in \mathbb{R},$$

which is of the form (1.1). Similarly, the measurement of clock offsets between
nodes v and w is $\zeta_{v,w} = t_v - t_w + \epsilon_{v,w}$. Note that for the measurement errors to
be zero mean, the transmission delay between u and v has to have the same mean
as the one between v and u. The task is now to estimate the clock offsets with
respect to the global time, which is defined to be the local time at some reference
node.
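
The two-way exchange of Case A can be checked with a short simulation. The sketch below is illustrative only: the delay distribution (exponential with a common mean) and all numerical values are assumptions made here, and the function names are hypothetical. It shows that the offset measurement is exact when the two delays are equal and unbiased on average when they merely share the same mean.

```python
import random

def offset_from_timestamps(tau_tu, tau_rv, tau_tv, tau_ru):
    """Offset measurement of Case A from the four recorded local times."""
    return 0.5 * ((tau_ru - tau_tv) - (tau_rv - tau_tu))

def simulate_exchange(t_u, t_v, mean_delay, rng):
    """One bidirectional exchange between u and v with random delays.

    Assumption of this sketch: both delays are exponential with the same mean.
    """
    t = rng.uniform(0.0, 100.0)                # global time of u's transmission
    d_uv = rng.expovariate(1.0 / mean_delay)   # delay from u to v
    d_vu = rng.expovariate(1.0 / mean_delay)   # delay from v to u
    t_prime = t + d_uv + 1.0                   # global time of v's reply
    tau_tu = t + t_u                           # u's local transmit time
    tau_rv = t + t_v + d_uv                    # v's local receive time
    tau_tv = t_prime + t_v                     # v's local transmit time
    tau_ru = t_prime + t_u + d_vu              # u's local receive time
    return offset_from_timestamps(tau_tu, tau_rv, tau_tv, tau_ru)

rng = random.Random(1)
samples = [simulate_exchange(0.8, 0.3, 0.05, rng) for _ in range(20000)]
mean_measurement = sum(samples) / len(samples)
print(mean_measurement)  # close to t_u - t_v = 0.5
```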

Figure 2.3. Measurement of differences in local times by bidirectional exchange
of time-stamped messages.

Case B: offset and skew: Now let us consider the general case when both clock
skews and offsets are present and need to be estimated. Suppose node u transmits
two messages to v, the first one at (global) time $t_1$ and the second one at (global)
time $t_2$. Define $T = t_2 - t_1$ and denote by $\alpha_u$ and $\alpha_v$ the clock skews of nodes u and
v relative to a reference clock. Then the time interval of transmission recorded
by u is $m_1 := \alpha_u T$. On the other hand, these two messages are received by node
v at local times $\alpha_v(t_1 + \delta_{uv}) + t_v$ and $\alpha_v(t_2 + \delta'_{uv}) + t_v$, respectively, where $\delta_{uv}$
and $\delta'_{uv}$ are two realizations of the random transmission delay from u to v. Node
v can then compute the difference between the local reception times, which is
$m_2 := \alpha_v T \left(1 + \frac{\delta'_{uv} - \delta_{uv}}{T}\right)$. Node v then sends the measurement $m_2$ back to node u.
From the numbers $m_1$ and $m_2$ available to it, node u can compute the following:

$$\log\frac{m_2}{m_1} = \log\left[\frac{\alpha_v}{\alpha_u}\left(1 + \frac{\delta'_{uv} - \delta_{uv}}{T}\right)\right] = \log\alpha_v - \log\alpha_u + \log\left(1 + \frac{\delta'_{uv} - \delta_{uv}}{T}\right) \approx \log\alpha_v - \log\alpha_u + \frac{\delta'_{uv} - \delta_{uv}}{T},$$

under the assumption that $|\delta'_{uv} - \delta_{uv}| \ll T$. Define $x_u := \log\alpha_u$, $x_v := \log\alpha_v$,
and $\epsilon_{v,u} := (\delta'_{uv} - \delta_{uv})/T$. Then the above measurement is equivalent to

$$\zeta_{v,u} := \log\frac{m_2}{m_1} = x_v - x_u + \epsilon_{v,u},$$

which is of the form (1.1). Note that $\epsilon_{v,u}$ is a zero-mean random variable as long as
$\delta_{uv}$ and $\delta'_{uv}$ have the same expected value. In this way, noisy relative skews can
be measured between pairs of nodes that can exchange time-stamped messages.
To estimate the offset, one first scales the local time of one of the sensors with
the already estimated relative skew, so that in the new time coordinate, the only
difference between the local times is an offset. This offset can then be measured
as described earlier. At the end, it is scaled back to obtain a noisy estimate of
the true offset. This way, both relative skews and offsets are measured that follow
the measurement model (1.1).

Figure 2.4. RBS: measurement of differences in local times by unidirectional
message exchange [1]. Both u and v receive a message transmitted by source
s at the same time. The local time at u when it received this message is later
broadcast. Upon receiving this message, node v can measure its relative offset
with u. Communication from v to u is not needed. So RBS allows relative offset
measurements even when communication is asymmetric.
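
The relative skew measurement of Case B can be sketched in a few lines. This is an illustration under assumptions made here: the clock model for v is taken to be local time = skew × global time + offset, as in the reception times of Case B, and the numerical values and the function name are hypothetical. The sketch computes the log-ratio measurement and compares it with its first-order approximation.

```python
import math

def relative_log_skew(alpha_u, alpha_v, t_v, t1, t2, d1, d2):
    """Log-ratio skew measurement from two one-way messages sent by u.

    d1 and d2 are the transmission delays of the first and second message.
    """
    T = t2 - t1
    m1 = alpha_u * T                      # interval recorded by the sender u
    r1 = alpha_v * (t1 + d1) + t_v        # v's local reception time, message 1
    r2 = alpha_v * (t2 + d2) + t_v        # v's local reception time, message 2
    m2 = r2 - r1                          # interval recorded by the receiver v
    return math.log(m2 / m1)

alpha_u, alpha_v = 1.0002, 0.9997
zeta = relative_log_skew(alpha_u, alpha_v, t_v=0.4, t1=5.0, t2=15.0, d1=0.001, d2=0.003)
approx = math.log(alpha_v) - math.log(alpha_u) + (0.003 - 0.001) / 10.0
print(zeta, approx)
```

The two printed values agree closely because the delay difference is much smaller than the interval T. Choosing a longer interval between the two messages reduces the size of the error term.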

All the measurement techniques are, however, susceptible to measurement
errors, especially when wireless communication is involved. Therefore careful
attention has to be paid to the error term $\epsilon_e$ on a measurement $\zeta_e$. These errors come
from several hardware-level issues at both the transmitter and the receiver, such as
the randomness in the processor load, delay in accessing the medium, and length
of the messages. An extensive description of these issues can be found in [56].

The  measurement  techniques  outlined  above  require  bidirectional  message  ex¬ 
change  between  pairs  of  nodes.  Measurements  of  relative  skews  and  offsets  be¬ 
tween  a  pair  of  nodes  can  also  be  obtained  by  the  RBS  (Reference  Broadcast 
System)  method,  which  does  not  require  bidirectional  message  exchange  between 
them but requires the involvement of a third node [1]. The RBS method is explained
briefly in Figure 2.4. Node p broadcasts a message marked as a synchronization
signal to its neighbors, and since the time of propagation of radio waves is
negligible compared to the other sources of delay in transmission and reception,
nodes u and v receive the message at the same global time, say t. From the
time the message is processed by the receiver antenna and physical layer of the
wireless device to the time it arrives at the application layer of the protocol stack
where the local time can be recorded, there will be a processing delay. Therefore,
the local times at nodes u and v recorded as the time of reception of the same
message are $\tau_{ru} := t + t^r_u + t_u$ and $\tau_{rv} := t + t^r_v + t_v$, respectively, where $t^r_u$, $t^r_v$ are
receiver-side processing delays at u and v. Node u then sends the recorded receive
time to v (or vice versa), and v can estimate the difference between their clock
offsets:

$$\zeta_{v,u} = \tau_{rv} - \tau_{ru} = t_v - t_u + (t^r_v - t^r_u) = t_v - t_u + \epsilon_{v,u},$$

where $\epsilon_{v,u} := t^r_v - t^r_u$ is zero-mean as long as the processing delays at both the
receivers u and v have the same expected value. It should be stressed that such a
relative measurement is available to v as long as it can receive messages from u,
even if u is unable to receive messages from v.
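
The RBS measurement can be simulated in the same spirit. The sketch below is illustrative only: the processing delays are assumed to be uniformly distributed with a common mean, and the function name and numerical values are hypothetical. Only the two recorded reception times enter the measurement, so the broadcasting node's own clock plays no role.

```python
import random

def rbs_relative_offset(t, t_u, t_v, proc_u, proc_v):
    """Relative offset t_v - t_u measured from one reference broadcast.

    t is the common global reception time; proc_u and proc_v are the
    receiver-side processing delays at u and v.
    """
    tau_ru = t + proc_u + t_u   # local time recorded at u
    tau_rv = t + proc_v + t_v   # local time recorded at v
    return tau_rv - tau_ru

rng = random.Random(0)
samples = [
    rbs_relative_offset(rng.uniform(0.0, 100.0), 0.2, 0.7,
                        rng.uniform(0.0, 0.01), rng.uniform(0.0, 0.01))
    for _ in range(5000)
]
mean_estimate = sum(samples) / len(samples)
print(mean_estimate)  # close to t_v - t_u = 0.5
```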


2.1.3  Motion  Consensus 

Consider  the  situation  where,  in  a  group  consisting  of  several  mobile  agents, 
each  agent  wants  to  determine  its  velocity  with  respect  to  the  velocity  of  a  leader 
using  only  measurements  of  its  relative  velocities  with  respect  to  nearby  agents. 
These measurements can be obtained, for example, by using vision-based sensors.
In particular, two nearby agents u and v moving with velocities $p_u$ and $p_v$,
respectively, have access to the measurement

$$\zeta_{u,v} = p_u - p_v + \epsilon_{u,v},$$

where $\epsilon_{u,v}$ denotes measurement error. The task is to determine the velocity
of  each  agent  with  respect  to  the  leader  based  solely  on  the  available  relative 
velocities  between  pairs  of  neighboring  agents.  The  same  problem  arises  when  the 
agents  are  trying  to  estimate  their  headings  with  respect  to  that  of  a  leader  using 
noisy  measurements  of  relative  headings  between  certain  pairs  of  agents. 

A  similar  situation  arises  in  a  group  of  mobile  nodes  when  pairs  of  nearby 
nodes  can  measure  their  relative  heading,  and  each  agent  wants  to  estimate  its 
heading  with  respect  to  a  leader. 

2.1.4  Mobile  robot  localization 

Consider a group of mobile robots such that certain pairs of robots can measure
their relative positions periodically, which can be obtained either by vision-based
techniques [57] or by a stereo-ranging device [58], or by one of the techniques
explained in the previous section. Furthermore, each robot can measure
how much it has moved in a given time interval, which can be obtained by dead
reckoning [59]. The problem is to estimate the position of each robot at the current
time based on all the relative position measurements.

Figure 2.5 depicts such a situation schematically. Two types of relative position
measurements are available: (i) those between two distinct robots at the same
time instant: $\zeta_{u_t,v_t} = x_{u_t} - x_{v_t} + \epsilon_{u_t,v_t}$, and (ii) those between the same
robot at two consecutive time instants: $\zeta_{u_{t+1},u_t} = x_{u_{t+1}} - x_{u_t} + \epsilon_{u_{t+1},u_t}$.
In terms of the estimation problems described above, the robot positions at various
times can be thought of as variables to be estimated. Since only the initial position
of one robot is assumed known, the number of variables to be estimated grows with time.
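
The growth in the number of unknowns can be made concrete by collecting both measurement types into one time-expanded edge list. The following is a minimal Python sketch and not part of the original text: the function name `time_expanded_graph`, the 0-based robot labels, and the assumption that the same robot pairs measure each other at every time instant are all illustrative choices.

```python
def time_expanded_graph(num_robots, T, neighbor_pairs):
    """Nodes are (robot, time) pairs; edges are relative position measurements.

    Assumes (for illustration only) that every pair in `neighbor_pairs`
    measures its relative position at each time instant 0..T.
    """
    nodes = [(u, t) for t in range(T + 1) for u in range(num_robots)]
    edges = []
    for t in range(T + 1):
        # type (i): two distinct robots at the same time instant
        edges += [((u, t), (v, t)) for (u, v) in neighbor_pairs]
    for t in range(T):
        # type (ii): the same robot at two consecutive time instants
        edges += [((u, t + 1), (u, t)) for u in range(num_robots)]
    return nodes, edges


nodes, edges = time_expanded_graph(num_robots=4, T=2, neighbor_pairs=[(0, 1), (1, 2)])
# Only the initial position of one robot, node (0, 0), is taken as known.
unknowns = [node for node in nodes if node != (0, 0)]
```

Each additional time step adds one new unknown per robot, so the list `unknowns` keeps growing with `T`.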


Figure 2.5. A group of mobile robots. Snapshots of the positions of four robots at
five different time instants are shown schematically. The unknown variables are
the robot positions $x_u(t)$, u = 0, 1, ..., N, where N is the number of robots, at the
time instants t = 1, 2, ..., T, where T is the current time index, except for the
initial position of robot 1, $x_1(0)$, which is taken as the reference. Two types of
relative position measurements are available: (i) those between two distinct robots
at the same time instant, $\zeta_{u_t,v_t} = x_{u_t} - x_{v_t} + \epsilon_{u_t,v_t}$, and (ii) those between the same
robot at two consecutive time instants, $\zeta_{u_{t+1},u_t} = x_{u_{t+1}} - x_{u_t} + \epsilon_{u_{t+1},u_t}$. Since only
the initial position of one robot is assumed known, the number of variables to be
estimated grows with time. In the terminology of Section 2.2, the measurement
graph is a function of time.

2.2  Measurement graph and optimal estimation

The estimation problem can be posed in terms of a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$
whose vertices or nodes represent variables and whose edges represent noisy relative
measurements. That is, for every $e \in \mathcal{E}$, a measurement of the following
form is available:

$$\zeta_e = x_u - x_v + \epsilon_e, \qquad \forall e = (u, v) \in \mathcal{E},\ u, v \in \mathcal{V}, \tag{2.1}$$

where $\epsilon_e$ is measurement error. The graph $\mathcal{G}$ is called the measurement graph. In
the sequel, we use the symbol e to denote not only an edge (u, v) but also the
index of the edge, so that e can take values in the set {1, 2, ..., m}, where
m is the total number of measurements. The covariance of the measurement error
$\epsilon_e$ is $P_e := E[\epsilon_e \epsilon_e^T]$. The measurement error covariances $P_e$, $e \in \mathcal{E}$, are assumed
known. The measurement errors on distinct edges are assumed uncorrelated, i.e.,
$E[\epsilon_e \epsilon_{\bar e}^T] = 0$ if $e \neq \bar e$. With relative measurements alone, determining $x_u$ is possible
only up to an additive constant. To avoid this ambiguity, we assume that at least
one of the nodes is used as a reference by all of the nodes, and therefore its node
variable can be assumed known. When several node variables are known, we
can have several references. The set of reference nodes is denoted by $\mathcal{V}_r$, where
$\mathcal{V}_r \subset \mathcal{V}$. An edge e = (u, v) is said to be incident on the nodes u and v. We write
$e \sim u$ to denote that e is incident on u.

Depending on the application, the measurement graph can vary with time.
In the sensor localization, time synchronization, and motion consensus problems
described earlier, it was implicitly assumed that the variables are fixed and
measurements are obtained once. As a result, the measurement graph was
time-invariant. However, in problems such as mobile robot localization, new variables
and measurements appear over time, and consequently the measurement graph
is time-varying. When the application leads to a time-varying graph, we can
sometimes examine the graph obtained by collecting all the variables and
measurements over a time period. In this dissertation we only consider measurement
graphs that are time-invariant.

2.2.1  The optimal estimator (BLUE) and the optimal estimates

The task is to estimate all of the unknown node variables from the measurements
and the reference variables. We examine the optimal estimate of the node
variables. The optimal estimates refer to the ones obtained by the best linear
unbiased estimator (BLUE), which has the minimum variance among all linear
unbiased estimators [9].

Consider a measurement graph $\mathcal{G}$ with n nodes and m edges. Recall that k is
the dimension of the node variables. Let $\mathbf{X}$ be a vector in $\mathbb{R}^{nk}$ obtained by stacking
together all the node variables, known and unknown, i.e., $\mathbf{X} := [x_1^T, x_2^T, \dots, x_n^T]^T$.
Define $\mathbf{z} := [\zeta_1^T, \zeta_2^T, \dots, \zeta_m^T]^T \in \mathbb{R}^{km}$ and $\boldsymbol{\epsilon} := [\epsilon_1^T, \epsilon_2^T, \dots, \epsilon_m^T]^T \in \mathbb{R}^{km}$. This stacking
together of variables allows us to rewrite (2.1) in the following form:

$$\mathbf{z} = \mathcal{A}^T \mathbf{X} + \boldsymbol{\epsilon}, \tag{2.2}$$

where $\mathcal{A}$ is a matrix uniquely determined by the graph. To construct $\mathcal{A}$, we start
by defining the incidence matrix A of the graph $\mathcal{G}$, which is an n x m matrix with
one row per node and one column per edge defined by $A := [a_{ue}]$, where $a_{ue}$ is
nonzero if and only if the edge $e \in \mathcal{E}$ is incident on the node $u \in \mathcal{V}$ [60]. When
nonzero, $a_{ue} = -1$ if the edge e is directed towards u and $a_{ue} = 1$ otherwise. The
matrix $\mathcal{A}$ that appears in (2.2) is an expanded version of the incidence matrix A,
defined by

$$\mathcal{A} := A \otimes I_k, \tag{2.3}$$

where $I_k$ is the k x k identity matrix and $\otimes$ denotes the Kronecker product.
Essentially, every entry of A is replaced by a matrix of the form $a_{ue} I_k$ to construct
the matrix $\mathcal{A}$ (see Figure 2.6 for an example). We call $\mathcal{A}$ the generalized incidence
matrix of $\mathcal{G}$.
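
The construction of the incidence matrix and its Kronecker expansion (2.3) can be sketched in a few lines of NumPy. This is an illustrative sketch rather than code from the original text: the function names, the 0-based node labels, and the particular 4-node, 5-edge edge list are assumptions made for the example.

```python
import numpy as np

def incidence_matrix(n, edges):
    """n x m incidence matrix: +1 at the start node, -1 at the end node of each edge."""
    A = np.zeros((n, len(edges)))
    for e, (u, v) in enumerate(edges):
        A[u, e] = 1.0    # edge e is directed away from u
        A[v, e] = -1.0   # edge e is directed towards v
    return A

def generalized_incidence_matrix(A, k):
    """Replace every entry a_ue by the k x k block a_ue * I_k, as in (2.3)."""
    return np.kron(A, np.eye(k))

# Hypothetical example graph: 4 nodes, 5 edges (two of them parallel).
edges = [(0, 1), (0, 1), (1, 3), (1, 2), (3, 2)]
k = 2
A = incidence_matrix(4, edges)
calA = generalized_incidence_matrix(A, k)

# Noise-free version of (2.2): block e of z equals x_u - x_v for e = (u, v).
X = np.arange(4 * k, dtype=float)
z_noise_free = calA.T @ X
```

With the stacked vector `X`, block `e` of `z_noise_free` is exactly the difference of the two node variables joined by edge `e`, which is the noise-free form of (2.1).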


Let $n_r$ denote the number of reference variables and $n_b$ denote the number of
unknown variables. Let $A_b$ be the submatrix of A that is obtained by removing
those rows from A that correspond to the reference nodes in $\mathcal{V}_r$, so that it contains
only those rows of A that correspond to the nodes in $\mathcal{V} \setminus \mathcal{V}_r$. The matrix $A_b$
is called the basis incidence matrix [61]. Clearly, the incidence matrix can be
decomposed as

$$A = \begin{bmatrix} A_b \\ A_r \end{bmatrix},$$

where $A_b \in \mathbb{R}^{n_b \times m}$ and $A_r \in \mathbb{R}^{n_r \times m}$. Define

$$\mathcal{A}_b := A_b \otimes I_k, \qquad \mathcal{A}_r := A_r \otimes I_k,$$

where $\mathcal{A}_b$ is now termed the generalized basis incidence matrix of $\mathcal{G}$ with reference
node set $\mathcal{V}_r$.

By partitioning $\mathbf{X}$ into a vector $\mathbf{x} \in \mathbb{R}^{k n_b}$ containing all the unknown node
variables and another vector $\mathbf{x}_r \in \mathbb{R}^{k n_r}$ containing all the known reference node
variables, $\mathbf{X} = [\mathbf{x}^T, \mathbf{x}_r^T]^T$, we can rewrite (2.2) as

$$\mathbf{z} = \mathcal{A}_r^T \mathbf{x}_r + \mathcal{A}_b^T \mathbf{x} + \boldsymbol{\epsilon},$$

where $\mathcal{A}_r$ contains the rows of $\mathcal{A}$ corresponding to the reference nodes and $\mathcal{A}_b$
contains the rows of $\mathcal{A}$ corresponding to the unknown node variables. The equation
above can be further rewritten as

$$\bar{\mathbf{z}} = \mathcal{A}_b^T \mathbf{x} + \boldsymbol{\epsilon}, \tag{2.4}$$

where $\bar{\mathbf{z}} := \mathbf{z} - \mathcal{A}_r^T \mathbf{x}_r$ is a known vector.

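Estimation of the unknown node variables in the vector $\mathbf{x}$ based on the linear
measurement model (2.4) is a classical estimation problem. Since $\boldsymbol{\epsilon}$ is a random
vector with zero mean and covariance matrix

$$\mathcal{P} := E[\boldsymbol{\epsilon} \boldsymbol{\epsilon}^T], \tag{2.5}$$

the BLU estimate $\hat{\mathbf{x}}^*$ of $\mathbf{x}$ is the solution to the system of linear equations [9]

$$\mathcal{L} \hat{\mathbf{x}}^* = \mathbf{b}, \tag{2.6}$$

where

$$\mathcal{L} := \mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T, \tag{2.7}$$

$$\mathbf{b} := \mathcal{A}_b \mathcal{P}^{-1} (\mathbf{z} - \mathcal{A}_r^T \mathbf{x}_r). \tag{2.8}$$

Since the measurement errors on two different edges are uncorrelated, $\mathcal{P}$ is a
symmetric positive definite block diagonal matrix with the measurement error
covariances along the diagonal: $\mathcal{P} = \mathrm{diag}(P_1, P_2, \dots, P_m) \in \mathbb{R}^{km \times km}$, where $P_e$ is
the covariance of the measurement error $\epsilon_e$.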
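
Equations (2.6)-(2.8) translate almost directly into a dense linear solve. The sketch below is an illustration under stated assumptions, not the author's implementation: the function name `blue`, its argument layout, the 0-based node labels, and the example graph are all hypothetical, and no attempt is made to exploit sparsity.

```python
import numpy as np

def blue(n, edges, refs, zeta, covs):
    """Solve (2.6) for the unknown node variables.

    n     : number of nodes, labelled 0..n-1
    edges : list of (u, v); edge e carries zeta_e = x_u - x_v + noise
    refs  : dict mapping each reference node to its known k-vector
    zeta  : array of shape (m, k), one measurement per row
    covs  : list of m covariance matrices P_e, each k x k
    Returns ({node: estimate}, L) with L the matrix in (2.7).
    """
    m = len(edges)
    zeta = np.asarray(zeta, dtype=float).reshape(m, -1)
    k = zeta.shape[1]
    A = np.zeros((n, m))
    for e, (u, v) in enumerate(edges):
        A[u, e], A[v, e] = 1.0, -1.0
    unknown = [u for u in range(n) if u not in refs]
    known = sorted(refs)
    Ab = np.kron(A[unknown, :], np.eye(k))   # generalized basis incidence matrix
    Ar = np.kron(A[known, :], np.eye(k))     # rows of the reference nodes
    Pinv = np.zeros((k * m, k * m))          # block-diagonal inverse covariance
    for e, P in enumerate(covs):
        Pinv[k*e:k*(e+1), k*e:k*(e+1)] = np.linalg.inv(np.asarray(P, dtype=float))
    xr = np.concatenate([np.atleast_1d(np.asarray(refs[u], dtype=float)) for u in known])
    L = Ab @ Pinv @ Ab.T                               # (2.7)
    b = Ab @ Pinv @ (zeta.reshape(-1) - Ar.T @ xr)     # (2.8)
    x_star = np.linalg.solve(L, b).reshape(-1, k)      # (2.6)
    return dict(zip(unknown, x_star)), L


# Hypothetical example: node 0 is the reference, measurements are noise-free.
edges = [(0, 1), (0, 1), (1, 3), (1, 2), (3, 2)]
x_true = np.array([[1.0, -1.0], [1.0, 2.0], [-2.0, 0.5], [3.5, -1.0]])
zeta = np.array([x_true[u] - x_true[v] for u, v in edges])
estimates, L = blue(4, edges, {0: x_true[0]}, zeta, [np.eye(2)] * len(edges))
```

With noise-free measurements the solve returns the true node variables, which is a quick sanity check that the sign conventions of the incidence matrix and of (2.8) agree.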

The next theorem establishes necessary and sufficient conditions on the
measurement graph $\mathcal{G}$ so that the optimal estimate of node variables is unique and
shows how the covariance of the estimation error $\mathbf{x} - \hat{\mathbf{x}}^*$ relates to the matrices
associated with the graph $\mathcal{G}$. The existence and uniqueness condition is one of
weak connectivity of the directed graph $\mathcal{G}$. A directed graph is weakly connected
if it is possible to go from every node to every other node of the graph traversing
the edges, not necessarily respecting the edge directions. An equivalent definition
is that it is weakly connected when there is an undirected path between every pair
of nodes. An undirected path $\mathscr{P}$ from a node $p_1$ to another node $p_m$ in a graph $\mathcal{G}$
is an alternating sequence of a finite number of nodes and edges that starts with $p_1$
and ends with $p_m$:

$$\mathscr{P} = \{p_1, e_1, p_2, e_2, \dots, p_i, e_i, p_{i+1}, \dots, e_{m-1}, p_m\}$$

such that every edge $e_i$ in the path is incident on the nodes $p_i, p_{i+1}$ adjacent to it in
the path, and no edges or nodes are repeated. If a graph is not weakly connected,
it can be decomposed into a number of disjoint subgraphs such that every one of
them is weakly connected. These subgraphs are called connected components of
the graph. When every weakly connected component of the measurement graph
has at least one reference node, we say that the graph is weakly connected to $\mathcal{V}_r$,
the set of reference nodes.

Figure 2.6. A measurement graph $\mathcal{G}$ and a few of its associated matrices: the
incidence matrix A, the generalized incidence matrix $\mathcal{A}$, the generalized basis
incidence matrix $\mathcal{A}_b$, and the edge-covariance matrix $\mathcal{P}$. The row and column
indices of A correspond to node and edge indices, respectively. The single positive
entry in each column of A, namely 1, indicates the start node of the corresponding
edge in $\mathcal{G}$, while the single negative entry -1 indicates the end node.

We call the pair $(\mathcal{G}, P)$, where $\mathcal{G}$ is a measurement graph and $P : \mathcal{E} \to \mathbb{S}^{k+}$ is
a function that assigns (symmetric positive definite) measurement error covariances
to the edges of the graph, a measurement network.

The next theorem, whose proof is provided in Section 2.3, establishes conditions
for the well-posedness of the BLUE estimation problem.

Theorem 2.2.1. The matrix $\mathcal{L}$ defined in (2.7) for a finite measurement network
$(\mathcal{G}, P)$ is invertible, and therefore the BLU estimate $\hat{\mathbf{x}}^*$ exists and is unique, if and only
if the measurement graph $\mathcal{G}$ is weakly connected to its reference nodes $\mathcal{V}_r$. When $\mathcal{L}$
is non-singular, the estimation error covariance matrix $\Sigma := E[(\mathbf{x} - \hat{\mathbf{x}}^*)(\mathbf{x} - \hat{\mathbf{x}}^*)^T]$
is given by

$$\Sigma = \mathcal{L}^{-1}. \qquad \square$$

The covariance matrix $\Sigma_u$ for the estimation error of a particular node variable
$x_u$ appears in the corresponding k x k diagonal block of $\Sigma$. A measurement
graph, along with the corresponding measurement equations (2.2) and (2.4) and
the node variable estimates computed from (2.6), is shown in Figure 2.8.

Figure 2.7. Dirichlet Laplacian for the graph in Figure 2.6.

Consider the measurement graph $\mathcal{G}$ with 4 nodes and 5 edges shown in Figure 2.6.
Node 1 is the reference. The incidence matrix A is therefore a 4 x 5
matrix consisting of 0s, 1s, and -1s. The matrix form (2.2) of the measurement
equations (1.1) for this graph is

$$\begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \\ \zeta_4 \\ \zeta_5 \end{bmatrix} =
\begin{bmatrix} I & -I & 0 & 0 \\ I & -I & 0 & 0 \\ 0 & I & 0 & -I \\ 0 & I & -I & 0 \\ 0 & 0 & -I & I \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \\ \epsilon_5 \end{bmatrix},$$

where I is the k x k identity matrix. The 4 node variables in the vector $\mathbf{X}$
are related to the 5 measurements in the vector $\mathbf{z}$ by the 4k x 5k matrix $\mathcal{A}$,
the expanded version of the incidence matrix. The measurement model (2.4)
when node 1 is the reference with $x_1 = 0$ is

$$\begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \\ \zeta_4 \\ \zeta_5 \end{bmatrix} =
\begin{bmatrix} -I & 0 & 0 \\ -I & 0 & 0 \\ I & 0 & -I \\ I & -I & 0 \\ 0 & -I & I \end{bmatrix}
\begin{bmatrix} x_2 \\ x_3 \\ x_4 \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \\ \epsilon_5 \end{bmatrix}.$$

The 3 unknown node variables in the vector $\mathbf{x}$ are related to the known
quantities, that is, the measurements $\mathbf{z}$ and the reference variable $x_1$,
by the 3k x 5k matrix $\mathcal{A}_b$.

Figure 2.8. An example of a measurement graph, the BLUE estimates and error
covariances.

(continued from Figure 2.8) Since the graph $\mathcal{G}$ is weakly connected, $\mathcal{L}$ is
invertible. The optimal estimate of the vector $\mathbf{x}$, the solution to (2.6), is given
by $\hat{\mathbf{x}}^* = \mathcal{L}^{-1} \mathcal{A}_b \mathcal{P}^{-1} \bar{\mathbf{z}}$. From Figure 2.8, it follows that the optimal estimate
$\hat{\mathbf{x}}^*$ when all measurement covariance matrices are equal to the identity matrix is

$$\begin{bmatrix} \hat x_2^* \\ \hat x_3^* \\ \hat x_4^* \end{bmatrix} =
\underbrace{\begin{bmatrix} 4I & -I & -I \\ -I & 2I & -I \\ -I & -I & 2I \end{bmatrix}^{-1}}_{\mathcal{L}^{-1}}
\underbrace{\begin{bmatrix} -I & -I & I & I & 0 \\ 0 & 0 & 0 & -I & -I \\ 0 & 0 & -I & 0 & I \end{bmatrix}}_{\mathcal{A}_b \mathcal{P}^{-1}}
\underbrace{\begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \\ \zeta_4 \\ \zeta_5 \end{bmatrix}}_{\mathbf{z} - \mathcal{A}_r^T \mathbf{x}_r}.$$

Note the Laplacian-like structure of the matrix $\mathcal{L}$. The covariance matrices
of the overall estimation error and of the individual node-variable errors are

$$\Sigma = \frac{1}{6} \begin{bmatrix} 3I & 3I & 3I \\ 3I & 7I & 5I \\ 3I & 5I & 7I \end{bmatrix}, \qquad
\Sigma_2 = \frac{1}{2} I, \qquad \Sigma_3 = \Sigma_4 = \frac{7}{6} I.$$

The covariance of the estimation error of node u is simply the (u-1)th diagonal
block of the covariance matrix $\Sigma$.

Figure 2.9. Figure 2.8 contd.
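
Theorem 2.2.1 can be checked numerically on small graphs. The sketch below is an illustration, not part of the original text: it takes k = 1 and unit error covariances, uses 0-based node labels (so the reference node 1 of the example becomes node 0), and the helper names `weakly_connected_to` and `dirichlet_laplacian` are hypothetical.

```python
import numpy as np

def weakly_connected_to(n, edges, refs):
    """True if every weakly connected component contains a reference node."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:              # edge directions are ignored
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = set(refs), list(refs)
    while stack:                    # graph search started from all reference nodes
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

def dirichlet_laplacian(n, edges, refs):
    """Matrix (2.7) for scalar node variables (k = 1) and P_e = 1 on every edge."""
    A = np.zeros((n, len(edges)))
    for e, (u, v) in enumerate(edges):
        A[u, e], A[v, e] = 1.0, -1.0
    Ab = A[[u for u in range(n) if u not in refs], :]
    return Ab @ Ab.T

# Example of Figure 2.8 with 0-based labels: node 0 is the reference.
edges = [(0, 1), (0, 1), (1, 3), (1, 2), (3, 2)]
L = dirichlet_laplacian(4, edges, refs={0})
Sigma = np.linalg.inv(L)            # error covariance, by Theorem 2.2.1

# A graph whose component {2, 3} contains no reference node.
edges_bad = [(0, 1), (2, 3)]
L_bad = dirichlet_laplacian(4, edges_bad, refs={0})
```

In the first case the graph is weakly connected to the reference and `Sigma` reproduces the covariance quoted in the example; in the second case `L_bad` is rank deficient, so (2.6) has no unique solution.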

Weak connectivity of the measurement graph is required not only for the
existence of the optimal estimator of a node variable $x_u$, but also for the existence
of any unbiased estimator of $x_u$. Before stating it formally, we emphasize the
distinction between an estimate and an estimator. Recall that a linear estimate
of a node variable is a linear combination of the measurements $\zeta_e$, $e \in \mathcal{E}$, specified
by a set of coefficient matrices. In particular, an estimate $\hat x_u$ of a node variable
$x_u$ is given by

$$\hat x_u = \sum_{e \in \mathcal{E}} c_e^T \zeta_e, \tag{2.9}$$

where the function $c : \mathcal{E} \to \mathbb{R}^{k \times k}$ specifies the coefficients of the measurements.
In the equation above, and in the sequel, for a function f with the edge set $\mathcal{E}$ as
the domain, we use $f_e$ to denote the value of the function at an edge $e \in \mathcal{E}$. We
call the function c the estimator of $x_u$. It should be stressed that the edge set
$\mathcal{E}$ is implicitly assumed to be finite. Otherwise the summation in (2.9) will be a
series and more care has to be exercised in defining an estimate or an estimator.
Estimation in infinite graphs is considered in Chapter 4.

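Now we formally state the result on the importance of weak connectivity.

Lemma 2.2.1. For a finite measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with reference node
set $\mathcal{V}_r \subset \mathcal{V}$, there exists an unbiased estimator for every node variable $x_u$, $u \in \mathcal{V}$,
if and only if $\mathcal{G}$ is weakly connected to $\mathcal{V}_r$. □

To prove this result, we will need the concept of a flow in a graph. A generalized
flow from node $u \in \mathcal{V}$ to node $v \in \mathcal{V}$ with intensity $\mathbf{j} \in \mathbb{R}^{k \times k}$ is an edge-function
$j : \mathcal{E} \to \mathbb{R}^{k \times k}$ such that

$$\sum_{\substack{(p,q) \in \mathcal{E} \\ p = \bar p}} j_{p,q} \;-\; \sum_{\substack{(q,p) \in \mathcal{E} \\ p = \bar p}} j_{q,p} =
\begin{cases} \mathbf{j} & \bar p = u \\ -\mathbf{j} & \bar p = v \\ 0 & \text{otherwise} \end{cases}
\qquad \forall \bar p \in \mathcal{V}. \tag{2.10}$$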

The reason flows are useful in the analysis of the estimation problem under
study is that they precisely characterize unbiased estimators, which is stated in
the next lemma. The proof of this lemma is provided in Section 2.3.

Lemma 2.2.2 (Unbiased Estimator). In a finite measurement network $(\mathcal{G}, P)$
with a reference node $o \in \mathcal{V}$, i.e., $\mathcal{V}_r = \{o\}$, an edge function j is a linear unbiased
estimator of a node variable $x_u$ if and only if j is a flow of intensity $I_k$ from node
u to the reference node o. In this case, the covariance of the error in the estimate
$\hat x_u$ is given by

$$E[(\hat x_u - x_u)(\hat x_u - x_u)^T] = \sum_{e \in \mathcal{E}} j_e^T P_e j_e. \qquad \square$$

The proof of Lemma 2.2.1, which is based on the result above, is presented in
Section 2.3. Finally, the next proposition formally relates the unbiased estimators
of $x_u$ and the best linear unbiased estimator of $x_u$.

Proposition 2.2.1. In a finite measurement network $(\mathcal{G}, P)$ with a reference
node $o \in \mathcal{V}$, for every node variable $x_u$, the best linear unbiased estimator c is
the flow $c : \mathcal{E} \to \mathbb{R}^{k \times k}$ of intensity $I_k$ from u to o that minimizes the quadratic
cost

$$\sum_{e \in \mathcal{E}} \mathrm{trace}\,(j_e^T P_e j_e)$$

among all flows j of intensity $I_k$ from u to o. □

The proof follows from the characterization of the BLUE as the unbiased
estimator that minimizes the sum of the variances of the estimation errors [62]
and the preceding discussion.
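
The flow characterization can be illustrated numerically for scalar node variables (k = 1), where an edge function is simply a vector with one entry per edge. The following is a sketch under assumptions that are not in the original text: the 4-node graph, the 0-based labels, unit error variances, and the helper names `is_unit_flow` and `cost` are all chosen for illustration. It compares a flow supported on a single path with the flow formed by the BLUE coefficients.

```python
import numpy as np

edges = [(0, 1), (0, 1), (1, 3), (1, 2), (3, 2)]
n, m = 4, len(edges)
o, u = 0, 2                       # reference node and node to be estimated
A = np.zeros((n, m))
for e, (p, q) in enumerate(edges):
    A[p, e], A[q, e] = 1.0, -1.0
P = np.eye(m)                     # unit error variance on every edge

def is_unit_flow(j, u, o):
    """Check the conservation law (2.10) with intensity 1 from u to o."""
    d = np.zeros(n)
    d[u], d[o] = 1.0, -1.0
    return np.allclose(A @ j, d)

def cost(j):
    """Error variance of the unbiased estimate defined by the flow j."""
    return float(j @ P @ j)

# Flow along the path 2 -> 1 -> 0: both edges (1, 2) and (0, 1) are traversed
# against their direction, hence the coefficient -1 on each.
j_path = np.array([-1.0, 0.0, 0.0, -1.0, 0.0])

# BLUE coefficients: the row of L^{-1} A_b P^{-1} that belongs to node u.
Ab = np.delete(A, o, axis=0)
L = Ab @ np.linalg.inv(P) @ Ab.T
row = u - 1                       # position of u among the unknowns when o = 0
j_blue = np.linalg.inv(P) @ Ab.T @ np.linalg.solve(L, np.eye(n - 1)[:, row])
```

Both vectors satisfy the flow condition, so both define unbiased estimators; the path flow has variance 2, while the BLUE flow spreads over all edges and attains the smaller variance 7/6.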

2.2.2  Dirichlet  Laplacian  and  BLUE 

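The matrix $\mathcal{L}$ has a structure similar to the Laplacian matrix L of the graph
$\mathcal{G}$, which is defined as $L := AA^T$ [60]. To explore this connection, we first
consider a matrix-weighted graph $\mathcal{G}$ whose edges have matrix-valued weights (that
are symmetric positive definite) associated with them, specified by a function
$W : \mathcal{E} \to \mathbb{S}^{k+}$. The symbol $\mathbb{S}^{k+}$ denotes the set of k x k symmetric positive
definite matrices. For a matrix-weighted graph $\mathcal{G}$ with weight function W, we define
the generalized, or matrix-weighted, graph Laplacian as

$$\mathscr{L} := \mathcal{A} \mathcal{W} \mathcal{A}^T \in \mathbb{R}^{kn \times kn}, \tag{2.11}$$

where $\mathcal{A}$ is the generalized incidence matrix of $\mathcal{G}$ and $\mathcal{W}$ is a block-diagonal matrix
with edge weights on its diagonal: $\mathcal{W} := \mathrm{diag}(W_1, \dots, W_m)$. Expanding (2.11), we get

$$\mathscr{L} = \mathcal{A} \mathcal{W} \mathcal{A}^T =
\begin{bmatrix} \mathcal{A}_b \\ \mathcal{A}_r \end{bmatrix} \mathcal{W}
\begin{bmatrix} \mathcal{A}_b^T & \mathcal{A}_r^T \end{bmatrix} =
\begin{bmatrix} \mathcal{A}_b \mathcal{W} \mathcal{A}_b^T & \mathcal{A}_b \mathcal{W} \mathcal{A}_r^T \\
\mathcal{A}_r \mathcal{W} \mathcal{A}_b^T & \mathcal{A}_r \mathcal{W} \mathcal{A}_r^T \end{bmatrix}.$$

For a measurement graph, if we assign edge weights as the inverses of measurement
error covariances, i.e., $W_e = P_e^{-1}$ for every $e \in \mathcal{E}$, then $\mathcal{W} = \mathcal{P}^{-1}$ and
$\mathcal{L} = \mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T$. So $\mathcal{L}$ is a principal submatrix of the generalized Laplacian $\mathscr{L}$. We
call $\mathcal{L}$ the generalized Dirichlet Laplacian or the generalized grounded Laplacian
of the matrix-weighted graph $\mathcal{G}$ with weight function W and boundary $\mathcal{V}_r$. We
will frequently refer to $\mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T$ (and $\mathcal{A} \mathcal{P}^{-1} \mathcal{A}^T$) as the Dirichlet Laplacian (and
Laplacian) for the network $(\mathcal{G}, P)$.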

Principal submatrices of the usual graph Laplacian matrix are called Dirichlet
Laplacians since they appear in the numerical solution of PDEs with Dirichlet
boundary conditions. They also appear in electrical network analysis when the
potential of one or more of the nodes is fixed at 0, hence they are also called
grounded Laplacians. In fact, we will shortly see that $\mathcal{L}$ plays a key role in an
abstract, generalized electrical network with matrix-valued currents and voltages.

We list below a few properties of the generalized Laplacian and the incidence
matrix that will be used in establishing certain results later in the dissertation.

Proposition 2.2.2. Let $\mathcal{G}$ be a measurement graph, A its incidence matrix and
$A_b$ the basis incidence matrix constructed by removing the rows corresponding
to the reference nodes from A. Then, the following statements are true:

1. If $\mathcal{G}$ is weakly connected and has n nodes, then the rank of A is n - 1, and
$\mathbf{1}^T A = 0$, where $\mathbf{1} \in \mathbb{R}^n$ is a vector of all 1's.

2. The basis incidence matrix $A_b$ has full row rank if and only if every weakly
connected component of the graph has at least one reference node.

3. $\mathscr{L}$ has at least k zero eigenvalues. It has exactly k zero eigenvalues if and
only if $\mathcal{G}$ is weakly connected.

4. $\mathscr{L} (\mathbf{1} \otimes I_k) = 0$. □

The last two statements are direct consequences of the first two, whose proofs
are contained in the proof of Theorem 2.2.1.
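
The four statements of Proposition 2.2.2 are easy to spot-check numerically. The sketch below is illustrative only: the weakly connected example graph, the choice k = 2, the random symmetric positive definite edge weights, and the variable names are assumptions, and a numerical check on one graph is of course not a proof.

```python
import numpy as np

k = 2
edges = [(0, 1), (0, 1), (1, 3), (1, 2), (3, 2)]   # weakly connected, 4 nodes
n, m = 4, len(edges)
A = np.zeros((n, m))
for e, (u, v) in enumerate(edges):
    A[u, e], A[v, e] = 1.0, -1.0
calA = np.kron(A, np.eye(k))                       # generalized incidence matrix

# Block-diagonal matrix of random symmetric positive definite edge weights.
rng = np.random.default_rng(0)
W = np.zeros((k * m, k * m))
for e in range(m):
    M = rng.standard_normal((k, k))
    W[k*e:k*(e+1), k*e:k*(e+1)] = M @ M.T + np.eye(k)

gen_lap = calA @ W @ calA.T                        # generalized Laplacian (2.11)
eigs = np.linalg.eigvalsh(gen_lap)

rank_A = np.linalg.matrix_rank(A)                  # statement 1: n - 1
rank_Ab = np.linalg.matrix_rank(A[1:, :])          # statement 2, node 0 as reference
num_zero_eigs = int(np.sum(np.abs(eigs) < 1e-9))   # statement 3: exactly k
residual = gen_lap @ np.kron(np.ones((n, 1)), np.eye(k))   # statement 4: zero
```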

Remark 2.2.1 (Role of edge directions). Note that the graph Laplacian $L = AA^T$
does not depend on the directions of the edges [60]. Since $\mathcal{W}$ is also independent
of the edge directions, it follows from the definition (2.11) that the generalized
Laplacian $\mathscr{L}$ is also independent of the edge directions chosen. Clearly, the matrix
$\mathcal{L}$, being a submatrix of $\mathscr{L}$, shares this property, too. Since the BLUE error
covariances are given by the inverse of $\mathcal{L}$ (cf. Theorem 2.2.1), they do not depend
on the edge directions. Therefore, as long as we are interested only in the BLUE
covariances, we can regard the measurement graph as undirected. However, the
optimal estimator of a node variable does depend on the edge directions. □
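
The remark can be illustrated on a three-node cycle with scalar node variables. In the sketch below, which is not part of the original text, the graph, the edge weights, and the helper name `dirichlet_laplacian` are assumptions; the matrix of estimator coefficients is taken to be $\mathcal{L}^{-1} \mathcal{A}_b \mathcal{P}^{-1}$, as in the solution of (2.6).

```python
import numpy as np

def dirichlet_laplacian(n, edges, weights, refs):
    """Return (L, A_b) for scalar node variables and edge weights 1 / P_e."""
    A = np.zeros((n, len(edges)))
    for e, (u, v) in enumerate(edges):
        A[u, e], A[v, e] = 1.0, -1.0
    Ab = A[[u for u in range(n) if u not in refs], :]
    return Ab @ np.diag(weights) @ Ab.T, Ab

edges = [(0, 1), (1, 2), (2, 0)]
flipped = [(1, 0), (1, 2), (2, 0)]     # same graph, first edge reversed
w = [1.0, 2.0, 4.0]

L1, Ab1 = dirichlet_laplacian(3, edges, w, refs={0})
L2, Ab2 = dirichlet_laplacian(3, flipped, w, refs={0})

# Estimator coefficients: row u holds the weights applied to each measurement.
C1 = np.linalg.solve(L1, Ab1 @ np.diag(w))
C2 = np.linalg.solve(L2, Ab2 @ np.diag(w))
```

The two Dirichlet Laplacians coincide, so the error covariances are identical, while the coefficients multiplying the measurement on the reversed edge change sign.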
2.2.3  Role of parallel edges

Two edges $e_1$ and $e_2$ are said to be parallel if they are incident on the same set
of nodes (irrespective of the direction of the edges). Parallel edges may be present
in a measurement graph, e.g., when a measurement $\zeta_{u,v}$ is obtained by one of
the nodes between u and v, and the measurement $\zeta_{v,u}$ is obtained by the other
node. Such a situation can occur when two nodes measure each other's relative
position by measuring range and angle, as described in Section 2.1.1. Parallel
edges may also appear when multiple relative measurements between the same
pair of nodes are obtained over time, and all these measurements are used to
define the measurement graph.

However, a measurement network with parallel edges can be reduced to one
without parallel edges, by replacing relative measurements on parallel edges with
a single measurement of appropriate covariance, so that the BLU estimates and
their associated error covariances do not change. Imagine there are $\ell$ parallel
edges $e_1, e_2, \dots, e_\ell$ between a pair of nodes u and v, with associated measurements
$\zeta_1, \dots, \zeta_\ell$ and covariances $P_1, P_2, \dots, P_\ell$. We can replace these $\ell$ parallel edges
with a single edge $e' := (u, v)$ with associated measurement $\zeta_{e'}$
and measurement error covariance $P_{e'}$ that are given by

$$P_{e'}^{-1} := P_1^{-1} + \dots + P_\ell^{-1},$$

$$\zeta_{e'} := (P_1^{-1} + \dots + P_\ell^{-1})^{-1} \big( s(1, e') P_1^{-1} \zeta_1 + \dots + s(\ell, e') P_\ell^{-1} \zeta_\ell \big),$$

where $s(e, e') = +1$ if the orientations of e and e' are the same, and $s(e, e') = -1$
if the orientations are opposite. Two parallel edges $e_1$ and $e_2$ are said to have
the same orientation if both are directed away from the same node, otherwise
their orientations are opposite. Replacing a set of parallel edges by a single edge
according to this procedure leaves the BLU estimates and the BLUE covariances
of all the node variables invariant. One could prove this fact by straightforward
but tedious manipulations, so a formal proof is omitted. Therefore, without any
loss of generality, we can assume that a measurement graph does not contain any
parallel edges.
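
Although the formal proof is omitted, the invariance is easy to check numerically. The following sketch, which is not part of the original text, does so for scalar node variables (k = 1) on a hypothetical three-node graph; the helper names `blue` and `merge_parallel`, the measurement values, and the variances are all made up for illustration.

```python
import numpy as np

def blue(n, edges, zeta, var, ref=0):
    """Scalar (k = 1) BLUE with node `ref` fixed at 0; returns (estimate, covariance)."""
    A = np.zeros((n, len(edges)))
    for e, (u, v) in enumerate(edges):
        A[u, e], A[v, e] = 1.0, -1.0
    Ab = np.delete(A, ref, axis=0)
    W = np.diag(1.0 / np.asarray(var, dtype=float))
    L = Ab @ W @ Ab.T
    return np.linalg.solve(L, Ab @ W @ np.asarray(zeta, dtype=float)), np.linalg.inv(L)

def merge_parallel(zeta, var, sign):
    """Combine parallel scalar measurements; sign[i] plays the role of s(e_i, e')."""
    zeta, var, sign = map(np.asarray, (zeta, var, sign))
    var_new = 1.0 / np.sum(1.0 / var)
    return var_new * np.sum(sign * zeta / var), var_new

# Edges 1 and 2 are parallel with opposite orientations.
edges = [(0, 1), (1, 2), (2, 1), (0, 2)]
zeta = [-1.1, -0.9, 1.2, -2.1]
var = [1.0, 0.5, 2.0, 1.0]

# Replace them by a single edge e' := (1, 2).
z_m, v_m = merge_parallel(zeta[1:3], var[1:3], sign=[+1, -1])
x_full, S_full = blue(3, edges, zeta, var)
x_red, S_red = blue(3, [(0, 1), (1, 2), (0, 2)],
                    [zeta[0], z_m, zeta[3]], [var[0], v_m, var[3]])
```

The estimates and covariances computed from the original and from the reduced network agree up to rounding error.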


2.3  Proofs 

Proof of Theorem 2.2.1. We will first consider the case when $\mathcal{G}$ has only one
connected component and prove that $\mathcal{L}$ is invertible if and only if the graph has at
least one reference node. When $\mathcal{G}$ is weakly connected, the rank of its incidence
matrix is rank(A) = n - 1, where n is the number of nodes [61]. If $\mathcal{G}$ has no
reference nodes, then $A = A_b$, which makes $A_b$, and thereby $\mathcal{A}_b$, rank deficient. Then
$\mathcal{L} = \mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T$ is singular. On the other hand, any submatrix obtained from A by
removing one or more rows has full row rank [61]. Any smaller submatrix must
obviously be full row rank. Now if the weakly connected graph $\mathcal{G}$ has at least
one reference node, $A_b$ is full row rank by the previous argument, and so is $\mathcal{A}_b$.
To prove that $\mathcal{L}$ is non-singular, assume that $\exists\, x \neq 0$ s.t. $x^T (\mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T) x = 0$.
Since $\mathcal{P}$ is symmetric positive definite, this implies $\mathcal{P}^{-1/2} \mathcal{A}_b^T x = 0$, where $\mathcal{P}^{-1/2}$
is the unique positive definite square root of $\mathcal{P}^{-1}$. Therefore $\mathcal{A}_b^T x = 0$, which
contradicts the fact that $\mathcal{A}_b$ has full row rank. This proves that when $\mathcal{G}$ is weakly
connected, $\mathcal{L}$ is invertible if and only if there is at least one reference node.

To examine the situation when $\mathcal{G}$ has more than one weakly connected component,
assume w.l.o.g. that it has two components $\mathcal{G}_1 = (\mathcal{V}_1, \mathcal{E}_1)$ and
$\mathcal{G}_2 = (\mathcal{V}_2, \mathcal{E}_2)$. Since the two components cannot have an edge or a node in common,
the generalized incidence matrix $\mathcal{A}$ of $\mathcal{G}$ can be written as

$$\mathcal{A} = \begin{bmatrix} \mathcal{A}_1 & 0 \\ 0 & \mathcal{A}_2 \end{bmatrix},$$

where $\mathcal{A}_i$ is the generalized incidence matrix of the component $\mathcal{G}_i$. Similarly,

$$\mathcal{A}_b = \begin{bmatrix} \mathcal{A}_{1,b} & 0 \\ 0 & \mathcal{A}_{2,b} \end{bmatrix},$$

where $\mathcal{A}_{i,b}$ corresponds to the component $\mathcal{G}_i$. As a result the matrix $\mathcal{L} = \mathcal{A}_b \mathcal{P}^{-1} \mathcal{A}_b^T$
for $\mathcal{G}$ can be written as

$$\mathcal{L} = \begin{bmatrix} \mathcal{A}_{1,b} \mathcal{P}_1^{-1} \mathcal{A}_{1,b}^T & 0 \\ 0 & \mathcal{A}_{2,b} \mathcal{P}_2^{-1} \mathcal{A}_{2,b}^T \end{bmatrix},$$

where $\mathcal{P}_i$ contains all the edge-covariance matrices belonging to the edges in $\mathcal{G}_i$.
If one of the components, say $\mathcal{G}_1$, does not have a reference node, then $\mathcal{A}_{1,b} = \mathcal{A}_1$
and so $\mathcal{A}_{1,b} \mathcal{P}_1^{-1} \mathcal{A}_{1,b}^T$ is singular, which makes $\mathcal{L}$ singular. If both components have
at least one reference node each, each of the diagonal blocks of $\mathcal{L}$ is non-singular,
which makes $\mathcal{L}$ invertible. This proves the theorem. ■

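Proof of Lemma 2.2.2. By definition, a linear estimate of the node variable $x_u$ in
the finite network $(\mathcal{G}, P)$ is given by

$$\hat x_u = \sum_{e \in \mathcal{E}} j_e^T \zeta_e$$

for some matrices $\{j_e, e \in \mathcal{E}\}$. Therefore,

$$\hat x_u = \sum_{(p,q) \in \mathcal{E}} j_{p,q}^T (x_p - x_q + \epsilon_{p,q}), \tag{2.13}$$

which implies that

$$\begin{aligned}
E[\hat x_u] &= \sum_{(p,q) \in \mathcal{E}} j_{p,q}^T (x_p - x_q)
 = \sum_{(p,q) \in \mathcal{E}} j_{p,q}^T x_p - \sum_{(p,q) \in \mathcal{E}} j_{p,q}^T x_q \\
&= \sum_{\bar p \in \mathcal{V}} \sum_{\substack{(p,q) \in \mathcal{E} \\ p = \bar p}} j_{p,q}^T x_{\bar p}
 - \sum_{\bar q \in \mathcal{V}} \sum_{\substack{(p,q) \in \mathcal{E} \\ q = \bar q}} j_{p,q}^T x_{\bar q} \\
&= \Big( \sum_{\bar p \in \mathcal{V}} x_{\bar p}^T \sum_{\substack{(p,q) \in \mathcal{E} \\ p = \bar p}} j_{p,q} \Big)^T
 - \Big( \sum_{\bar q \in \mathcal{V}} x_{\bar q}^T \sum_{\substack{(p,q) \in \mathcal{E} \\ q = \bar q}} j_{p,q} \Big)^T.
\end{aligned} \tag{2.14}$$

If j is a flow with intensity $I_k$ from u to o, using (2.10) we conclude that the first
term above can be expressed as

$$\begin{aligned}
\Big( \sum_{\bar p \in \mathcal{V}} x_{\bar p}^T \sum_{\substack{(p,q) \in \mathcal{E} \\ p = \bar p}} j_{p,q} \Big)^T
&= (x_u - x_o) + \Big( \sum_{\bar p \in \mathcal{V}} x_{\bar p}^T \sum_{\substack{(q,p) \in \mathcal{E} \\ p = \bar p}} j_{q,p} \Big)^T \\
&= (x_u - x_o) + \Big( \sum_{\bar q \in \mathcal{V}} x_{\bar q}^T \sum_{\substack{(p,q) \in \mathcal{E} \\ q = \bar q}} j_{p,q} \Big)^T.
\end{aligned}$$

Combining this with (2.14), we get $E[\hat x_u] = (x_u - x_o) = x_u$, because $x_o = 0$, which
proves sufficiency.

If j is not a flow, there is at least one node, say $r \in \mathcal{V}$, where the flow
condition (2.10) is violated. Assume for the moment that r is neither u nor o. Taking
expectations in (2.13), we can write

$$\begin{aligned}
E[\hat x_u] &= \sum_{\substack{(p,q) \in \mathcal{E} \\ p = r}} j_{p,q}^T (x_p - x_q)
 + \sum_{\substack{(p,q) \in \mathcal{E} \\ q = r}} j_{p,q}^T (x_p - x_q) + T_1 \\
&= \Big( \sum_{\substack{(p,q) \in \mathcal{E} \\ p = r}} j_{p,q} - \sum_{\substack{(p,q) \in \mathcal{E} \\ q = r}} j_{p,q} \Big)^T x_r + T_2,
\end{aligned}$$

where $T_1$ denotes the remaining terms of the sum and does not involve $x_r$, and
where the terms constituting $T_2$ also do not involve $x_r$. Since the flow
condition (2.10) is not satisfied at r, the coefficient of $x_r$ above is not zero and so $\hat x_u$ is
biased. The same proof technique can be applied to the case when r is either u
or o, which proves necessity.

If j is an unbiased estimator of $x_u$ in the finite network $(\mathcal{G}, P)$, the covariance of
the estimation error is

$$\begin{aligned}
\Sigma_{u,o} := E[(\hat x_u - x_u)(\hat x_u - x_u)^T]
&= E\Big[ \Big( \sum_{i=1}^{|\mathcal{E}|} j_i^T \epsilon_i \Big) \Big( \sum_{i=1}^{|\mathcal{E}|} j_i^T \epsilon_i \Big)^T \Big] \\
&= \sum_{i=1}^{m} j_i^T E[\epsilon_i \epsilon_i^T] j_i = \sum_{i=1}^{m} j_i^T P_i j_i,
\end{aligned}$$

where the second equality was obtained by using the fact that the measurement
errors on different edges are uncorrelated. This proves the second statement of
the lemma. ■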


Now  we  are  ready  to  prove  Lemma  2.2.1. 


Proof  of  Lemma  2.2.1.  Without  loss  of  generality,  assume  that  there  is  only  one 
reference  node:  lfr  =  {o}.  If  Q  is  weakly  connected,  we  can  construct  an  undi¬ 
rected  path  V  from  u  to  o  and  define  an  edge-function  j  as  follows: 


j 


jepath  =  < 


-J 


0 


e  G  V,  e  =  V 
e  G  V,  e  V 
e  V 


where  e  =  V  means  that  the  orientation  of  the  edge  e  is  the  same  as  the  orientation 
of  the  path  V,  and  e  V  means  the  orientations  are  opposite.  The  orientation  of 
an  edge  e  in  a  path  V  —  . . .  ,p,  e,  q, . . .  is  said  to  be  the  same  as  the  orientation 
of  the  path  if  e  =  (p,  q).  If  e  =  (g,p),  the  orientation  of  the  edge  is  opposite  to 
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that  of  the  path.  It  is  straightforward  to  see  that  j  is  a  flow  of  intensity  4  from 
u  to  o,  and  therefore  by  Lemma  2.2.2,  j  is  an  unbiased  estimator  of  xu. 

If $\mathcal{G}$ is not weakly connected, it can be decomposed into a number of disjoint
subgraphs such that every one of them is weakly connected. These subgraphs
are called weakly connected components of $\mathcal{G}$. Pick such a weakly connected
component that does not contain the node o, call it $\mathcal{G}_1 = (\mathcal{V}_1, \mathcal{E}_1)$, and pick an
arbitrary node u in $\mathcal{G}_1$. To obtain a contradiction, assume that there exists a flow
of matrix intensity $I_k$ from u to o. Let $\mathcal{A}_1$ be the generalized incidence matrix of
$\mathcal{G}_1$. Let $\mathcal{G}_1$ consist of N nodes and M edges, and without loss of generality, let u
be numbered as node 1. Define $J := [j_1^T, \ldots, j_M^T]^T \in \mathbb{R}^{kM \times k}$ as the tall matrix
of the flows on the edges in the component $\mathcal{G}_1$ and $\omega := [I_k, 0, \ldots, 0]^T \in \mathbb{R}^{kN \times k}$,
with $I_k$ in the 1st $k \times k$ block position and 0 everywhere else. Then the
conservation law (2.10) can be expressed compactly as

$$\mathcal{A}_1 J = \omega. \tag{2.15}$$

Now define $\mathbf{1} := [I_k, I_k, \ldots, I_k]^T \in \mathbb{R}^{kN \times k}$ and multiply both sides of the equation
above by $\mathbf{1}^T$, which is equivalent to adding all the block rows. It follows from
Proposition 2.2.2 that the rows of $\mathcal{A}_1$ add up to 0, that is, $\mathbf{1}^T \mathcal{A}_1 = 0$. Therefore
we obtain the following contradiction:

$$\mathbf{1}^T \mathcal{A}_1 J = \mathbf{1}^T \omega \;\Rightarrow\; 0 = I_k. \tag{2.16}$$

Thus no flow of intensity $I_k$ from u to o is possible. The result then follows from
Lemma 2.2.2. ■
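
For scalar node variables (k = 1), the key step of this argument, namely that adding all rows of the incidence matrix of a component gives zero, can be checked numerically. The sketch below is only an illustration: the three-node component, the function name `incidence_matrix`, and the sign convention (+1 where an edge leaves a node, -1 where it enters) are assumptions made for the example.

```python
def incidence_matrix(nodes, edges):
    """Node-by-edge incidence matrix: +1 where an edge leaves a node, -1 where it enters."""
    index = {u: r for r, u in enumerate(nodes)}
    A = [[0] * len(edges) for _ in nodes]
    for c, (u, v) in enumerate(edges):
        A[index[u]][c] = 1
        A[index[v]][c] = -1
    return A


# Hypothetical component G_1 with three nodes; it does not contain the reference node o.
nodes = ["u", "p", "q"]
edges = [("u", "p"), ("p", "q"), ("q", "u")]
A1 = incidence_matrix(nodes, edges)

# Adding all rows of A1 gives the zero row, so 1^T A1 J = 0 for every flow vector J.
row_sum = [sum(A1[r][c] for r in range(len(nodes))) for c in range(len(edges))]

# The right-hand side omega = [1, 0, 0]^T (unit intensity injected at node u) sums to 1,
# so A1 J = omega cannot hold for any J.
omega = [1, 0, 0]
print(row_sum, sum(omega))
```

Because each edge contributes exactly one +1 and one -1, the conclusion does not depend on which edges the component contains.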

Chapter 3

Distributed algorithms for optimal estimation


In this chapter we answer the distributed algorithm question raised in
Section 1.1, concerning computation of the optimal estimates of the node variables in
a distributed way. We show that this objective is indeed feasible, and present two
distributed asynchronous algorithms that achieve this goal. The algorithms are
iterative: every node starts with an arbitrary initial guess for its variable
and successively improves its estimate by using two pieces of local information,
the measurements available to it and the estimates of the nearby nodes. The
latter can be obtained by communicating with the nearby nodes. The algorithms
are guaranteed to converge to the optimal estimate as the number of iterations
goes to infinity, as long as certain conditions on the inter-node communication are
satisfied. The second algorithm is designed to have a faster convergence rate
than the first. Both algorithms are robust to link failures: they converge
even in the presence of temporary faults of nodes and communication links.

Both algorithms require each node to have embedded communication and
computation capability. As a result, these algorithms are not applicable to the
case when some of the nodes in the measurement graph are not physical
entities. An example of such an application is the mobile robot localization problem
discussed in Section 2.1.4. Moreover, the measurement graph is assumed fixed
in time. In some applications, such as the sensor network localization problem
discussed in Section 2.1.1, this means that the nodes are static. In other
applications, a time-invariant measurement graph does not preclude motion of the nodes.
For example, in both the time-synchronization and heading estimation problems
discussed in Section 2.1.2 and Section 2.1.3, nodes can be moving without
introducing any time variation in the measurement graph. In this chapter we only
consider measurement graphs that are time-invariant.

Organization: Section 3.1 describes the constraints on computation and
inter-node communication that an algorithm must abide by in order to qualify as a
distributed algorithm. In Section 3.3 we describe the Jacobi algorithm and analyze its
properties, including correctness, convergence rate, computation-communication
trade-off and the effect of asymmetry in inter-node communication. Section 3.4
describes the OSE algorithm and analyzes its convergence properties. In
Section 3.5 we specifically discuss the effects of asymmetric communication. The
chapter concludes with a discussion of open issues in Section 3.6.


3.1  Problem  statement 

Since the optimal estimate is the solution of the system of linear equations (2.6):
Ax = b, we seek iterative algorithms to compute its solution. We assume that
every node $u \in \mathcal{V}$ is a physical entity with a unique identifier that has the
capability to carry out computations and communicate information with a set
of nearby nodes. We take the index of a node (nodes are indexed from 1 through
n, where n is the number of nodes) as the unique identifier of the node. An
algorithm qualifies as a distributed algorithm only if it satisfies the constraint
that every node computes its own estimate and the information needed to carry
out the computation is obtained by communication with its nearby nodes.

In order to describe the phrase “communication with nearby nodes” precisely,
we define the communication graph $\mathcal{G}_c = (\mathcal{V}, \mathcal{E}_c)$ associated with the
measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, which is a directed graph that consists of the same node
set as the measurement graph but with a (typically) different edge set, whose
edge directions determine which nodes can receive information from which other
nodes. In particular, a node u can receive information from another node v if
and only if there is an edge $(v,u) \in \mathcal{E}_c$. Note that when an edge (v,u)
exists in the communication graph, the reverse edge (u,v) may not exist, in which
case u can receive messages from v but not vice versa. Such asymmetry is quite
common in wireless communication [63]. Asymmetry in communication can
be caused, especially in ad-hoc wireless networks, by inhomogeneous
interference, packet collisions, and imperfect sleep scheduling arising from inaccurate
time-synchronization. Communication between two nodes u and v is called
symmetric if and only if both (u, v) and (v, u) belong to $\mathcal{E}_c$.

One has to keep in mind, though, that in certain situations, a relative
measurement between a pair of nodes can be obtained only when the communication
between them is symmetric. The relative clock offset measurement technique
described in Section 2.1.2 (case A.) is one such example. However, there are
situations when relative measurements can be obtained without symmetric
communication. The relative positions obtained from range and bearing
measurements described in Section 2.1.1, and relative clock offset measurements by the
RBS method of [1], described in Section 2.1.2, are such examples. Therefore it
is important to allow the possibility of asymmetric communication in designing
distributed algorithms.

Now we can describe precisely what we mean by a distributed algorithm. An
iterative algorithm devised to compute the optimal estimate $\hat{x}^*$ is called distributed
if it satisfies the following constraints:

Constraint 3.1 (Distributed).

1. Every node has knowledge of the relative measurements (and associated
error covariances) corresponding to the edges in the measurement graph
that are incident on itself.

2. At every iteration, each node u is allowed to receive a message from the nodes
in its 1-hop in-neighborhood in the communication graph, $\mathcal{N}_u^c$, which is
defined as:

$$\mathcal{N}_u^c = \{v \in \mathcal{V} \mid (v,u) \in \mathcal{E}_c\}. \tag{3.1}$$

3. Each node is allowed to perform computations involving only variables that
are local to the node or that were previously obtained.

The following assumptions hold for every distributed algorithm
considered in this chapter.

Assumption 3.1.1.

1. The measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is weakly connected with respect to
its reference nodes and does not contain parallel edges.

2. The communication graph $\mathcal{G}_c = (\mathcal{V}, \mathcal{E}_c)$ is such that for every pair of nodes
that have a measurement edge between them, there is at least one
communication edge between them.

3. If there is no measurement edge between a pair of nodes, then there is no
communication edge between them, and no communication edge is directed
toward a reference node.

4. Every node that is not a reference node has at least one communication edge
directed toward it. □

The  assumption  of  not  having  parallel  edges  is  not  restrictive  because  multiple 
measurements  between  the  same  pair  of  nodes  can  be  combined  into  a  single 
measurement  (see  Section  2.2.3).  The  second  condition  ensures  that  the  nodes 
employing  the  algorithm  will  be  able  to  use  all  the  available  measurements.  The 
third  condition  clarifies  that  the  communication  graph  is  used  only  to  model  the 
information  exchange  that  occurs  during  the  execution  of  the  algorithm.  The 
fourth  condition  ensures  that  every  node  (other  than  a  reference  node)  is  able  to 
receive  messages  from  at  least  one  neighbor,  since  otherwise  it  cannot  update  its 
estimate. 

The communication graph $\mathcal{G}_c$ is called symmetric if whenever $(u, v) \in \mathcal{E}_c$,
where u, v are not reference nodes, we also have $(v, u) \in \mathcal{E}_c$. If there is at least one
communication edge $(u,v) \in \mathcal{E}_c$ between non-reference nodes u and v such that
$(v,u) \notin \mathcal{E}_c$, then the communication graph $\mathcal{G}_c$ is called asymmetric. Recall that
an edge e (in $\mathcal{G}$ or $\mathcal{G}_c$) between two nodes u and v is said to be incident on both
the nodes u and v, which is denoted by $e \sim u$ and $e \sim v$, respectively, whether
the edge is directed from u to v or otherwise.

Figure 3.1 shows a measurement graph $\mathcal{G}$ and an associated communication
graph $\mathcal{G}_c$. The lack of communication from 2 to 1 does not introduce asymmetry
since the reference node 1 does not use any information from its neighbors (see
Assumption 3.1.1). According to our terminology the communication graph in
the figure is therefore symmetric. Figure 3.2 shows another example of a
measurement graph and a communication graph, where the communication graph is
now asymmetric. The asymmetry comes from the lack of a communication edge
(3,4), which means that node 3 can receive broadcasts from 4 but not the other
way around.

Figure 3.1. A measurement graph $\mathcal{G}$ (panel a) and a symmetric communication graph
$\mathcal{G}_c$ (panel b) associated with it. Though there is no communication edge from 2 to 1,
that is not a cause of asymmetry by Assumption 3.1.1.

Figure 3.2. A measurement graph $\mathcal{G}$ (panel a) and an asymmetric communication graph
$\mathcal{G}_c$ (panel b) associated with it. The asymmetry in the communication graph comes
from the lack of a communication edge from node 3 to node 4.
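
The definitions of the in-neighborhood (3.1) and of a symmetric communication graph translate directly into a few lines of code. The following is a minimal sketch under stated assumptions: the four-node edge lists are hypothetical and only loosely modeled on Figures 3.1 and 3.2, and the function names `in_neighbors` and `is_symmetric` are introduced here for illustration.

```python
def in_neighbors(nodes, comm_edges):
    """Return {u: set of v such that (v, u) is a communication edge}, as in (3.1)."""
    nbrs = {u: set() for u in nodes}
    for v, u in comm_edges:
        nbrs[u].add(v)  # edge (v, u): node u can receive from node v
    return nbrs


def is_symmetric(comm_edges, reference_nodes):
    """True if every edge between two non-reference nodes has its reverse edge."""
    edges = set(comm_edges)
    return all(
        (v, u) in edges
        for (u, v) in edges
        if u not in reference_nodes and v not in reference_nodes
    )


# Hypothetical example: node 1 is the reference, so no edge is directed toward it.
nodes = [1, 2, 3, 4]
reference_nodes = {1}
symmetric_edges = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (4, 2), (3, 4), (4, 3)]
# Dropping (3, 4) means node 4 can no longer receive from node 3.
asymmetric_edges = [e for e in symmetric_edges if e != (3, 4)]

print(in_neighbors(nodes, asymmetric_edges)[4])  # node 4 hears only node 2
print(is_symmetric(symmetric_edges, reference_nodes))
print(is_symmetric(asymmetric_edges, reference_nodes))
```

Edges that start at a reference node are skipped in the symmetry check, which mirrors the remark that the missing edge from 2 to 1 is not a cause of asymmetry.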

Given an iterative algorithm, let $\hat{x}^{(i)} = [\hat{x}_1^{(i)T}, \ldots, \hat{x}_{n_b}^{(i)T}]^T$, where $n_b$ is the
number of non-reference nodes, be the vector of node estimates after the ith iteration
of the algorithm has been completed. One iteration is said to be complete when
all nodes update their estimate once. The algorithm's error at the ith iteration
with respect to the BLU estimate $\hat{x}^*$ is

$$e^{(i)} := \hat{x}^{(i)} - \hat{x}^*. \tag{3.2}$$

The algorithm is said to be correct if the error $e^{(i)} \to 0$ as $i \to \infty$ for every initial
condition $\hat{x}^{(0)}$. We also define the error ratio at the ith iteration


$$\varepsilon^{(i)} := \ldots \tag{3.3}$$


The number of iterations i required so that the error ratio $\varepsilon^{(i)}$ attains a value
lower than $\epsilon$, denoted by $n_{\mathrm{iter}}(\epsilon)$, is used as a measure of the convergence rate of
the algorithm. The error ratio is particularly useful in comparing the convergence
rate of two algorithms that start with the same initial estimates.


3.2  Contribution  and  prior  work 

In this chapter, we propose two distributed algorithms, namely, the Jacobi
algorithm and the overlapping subgraph estimator (OSE) algorithm, to compute
the BLU estimates of the node variables from the relative measurements. The
algorithms are distributed in the sense that they satisfy Constraint 3.1. Parts of
this chapter's material have been reported in the papers [64-68]. The results
of this chapter are summarized below.

1. We show that the Jacobi algorithm converges to the optimal estimate when
the communication graph is symmetric. When the communication graph
is asymmetric, the algorithm converges to a sub-optimal estimate as
long as certain conditions on the communication graph are satisfied. These
conditions have to do with the flow of information from the reference nodes
to the rest of the nodes.

The number of iterations required by the Jacobi algorithm to make the
error ratio lower than a specified value is established in terms of algebraic
properties of the graph.

The Jacobi algorithm is proved to be robust to temporary node and
communication failures and to asynchronous computation. That is, the estimates
produced by the algorithm as time goes to infinity do not change even
if the nodes communicate and update their estimates in an asynchronous
manner, and some nodes and communication edges fail temporarily. A
special structure of the measurement error covariance matrices is assumed
(described in Assumption 3.3.1).

2. The OSE algorithm is proved to converge to the optimal estimate when
the communication graph is symmetric. The algorithm is also shown to
be robust to temporary communication-link and node failures, under the
assumption of a special structure of covariance matrices.

3. We compare the energy consumption of three algorithms, Jacobi, EPA [69],
and OSE, when nodes exchange information through wireless
communication. Through simulations with a simple yet realistic model of energy
consumption, we see that the OSE algorithm can drastically cut down the total
energy expended to reach within a specified level of the optimal estimates.

4. We also show that in the presence of communication asymmetry, it is
impossible to design a distributed algorithm satisfying Constraint 3.1.

Convergence results for both algorithms with asynchronous communication
have been obtained under the assumption that the measurement error covariances
have a special structure (described in Assumption 3.3.1). Simulations indicate
that the algorithms converge even when the assumption is violated, but a proof
is still lacking.

Prior work: Although the problem of localization has been extensively studied
in the last 10 years, fueled by the explosive interest in sensor networks, most of the
work on the topic has concentrated on estimating node locations from range
measurements alone, and a few on estimating node locations from angle measurements
alone [39, 43-50]. Estimating locations from both range and angle measurements,
which is equivalent to localization from relative position measurements, has
attracted attention only recently (see [52] and [51]). However, distributed algorithms
to compute the BLUE location estimates have not been investigated.

There is a rich literature on time-synchronization in a network of processors.
The NTP (Network Time Protocol) is a forerunner in time synchronization
protocols developed for the wired Internet, but is less desirable for wireless sensor
networks due to its high energy consumption [70]. Many protocols have been
developed in recent years for wireless networks, which include the RBS (Reference
Broadcast System) [1], the TPSN (Timing-sync Protocol for Sensor Networks) [71],
and the FTSP (Flooding Time Synchronization Protocol) [56], to name a few.
However, none of these synchronization protocols attempts to compute the
optimal estimates of clock skews and offsets; rather they use a single path from a
node to the reference to estimate those node variables. For example, in the TPSN
protocol [71], nodes close to a root node, called level 1 nodes, synchronize their
clocks to the root node's clock by using relative offset measurements. The nodes
close to level 1 nodes, called level 2 nodes, in turn synchronize their clocks to the
level 1 nodes' clocks, and so on until all the nodes are synchronized. See [72] for a
review of time synchronization protocols for sensor networks. To the best of our
knowledge, Karp et al. [73] were the first to allude to a distributed algorithm for
computing the optimal clock offset estimates. However, the algorithm was merely
suggested, not analyzed, in [73]. A distributed algorithm was later proposed in [74]
for estimating time-offsets from all available relative offset measurements.

The EPA algorithm proposed by Delouille et al. [69], though for a completely
different application, can also be used to compute the BLU estimates in a
distributed manner. We compare the Jacobi and OSE algorithms to the EPA
algorithm through simulations in Sections 3.3.3 and 3.4.1.

Since the optimal estimate is a solution of a system of linear equations (2.6), we
exploit iterative techniques of solving linear equations, which have a long and rich
history [75]. In fact, the algorithms proposed by both Karp et al. [73] and
Giridhar and Kumar [74] for distributed time synchronization are based on the Jacobi
method of iteratively solving linear equations [75]. Our first algorithm for
computing the optimal estimates from relative measurements is also based on the Jacobi
method. In that respect, the Jacobi algorithm proposed in this dissertation is
not novel. However, we provide a thorough analysis of the effect of asymmetric
communication on the Jacobi algorithm's convergence properties. In contrast, in
the algorithm proposed by Giridhar and Kumar [74] for estimating time-offsets,
the effect of such asymmetry was overlooked. The EPA algorithm proposed in [69]
is a block-Jacobi method [75] of solving linear equations.

Apart from the Jacobi method, there are many other iterative methods of
solving linear equations, some of them having a faster convergence rate, such as
Gauss-Seidel, SOR, and conjugate gradient methods [75]. However, not all of these
methods are applicable in devising distributed algorithms, since the information
required by these algorithms to carry out the computations may demand an
unacceptable level of communication between nodes or the intervention of a central
authority. For example, the Gauss-Seidel method requires that the variables be
updated in a specific sequence. In a network of devices that exchange information
with one another through wireless communication, ensuring such an order while
satisfying Constraint 3.1 may be quite difficult. However, the Weighted
Additive Schwarz method [76] offers potential for distributed implementation. The
OSE algorithm described in this dissertation is closely related to the multisplitting
and Weighted Additive Schwarz method of solving linear equations [76].


3.3  Jacobi  algorithm 

In the Jacobi algorithm, a node obtains multiple estimates of its own variable
by adding the appropriate relative measurements to its neighbors' estimates. It
then computes the new estimate of its variable by taking a weighted average of
those estimates. To describe the algorithm, we first define the 1-hop measurement
neighborhood of u, denoted by $\mathcal{N}_u$, as the set of nodes with which u shares a
measurement edge (irrespective of the direction):

$$\mathcal{N}_u = \{v \in \mathcal{V} \mid (u,v) \in \mathcal{E} \text{ or } (v,u) \in \mathcal{E}\} = \{v \in \mathcal{V} \mid d_{\mathcal{G}}(u,v) = 1\}, \tag{3.4}$$

where $d_{\mathcal{G}}(u,v)$ is the graphical distance between the nodes u and v, which is the number
of edges that have to be traversed in going from one to the other. The graphical
distance is evaluated without regard to the edge directions. The Jacobi algorithm
for computing the optimal estimates of the node variables is an iterative algorithm
that operates as follows for each node $u \in \mathcal{V} \setminus \mathcal{V}_r$:

1. Node u picks an arbitrary initial estimate for the node variables $x_v$,
$v \in \{u\} \cup \mathcal{N}_u^c$, i.e., for itself and those of its neighbors from which it can receive
messages. The neighbor set $\mathcal{N}_u^c$ in the communication graph was defined
in Constraint 3.1. These estimates need not be consistent across different
nodes. The reference nodes start at their known values.

2. At the ith iteration, node u assumes that the current estimate $\hat{x}_v^{(i)}$ for the
node variable $x_v$ of each communication neighbor $v \in \mathcal{N}_u^c$ is correct and
updates its own estimate by solving the following equation:

$$\Big(\sum_{e \in \vec{\mathcal{E}}_u(1)} P_e^{-1}\Big)\, \hat{x}_u^{(i+1)} = \sum_{e \in \vec{\mathcal{E}}_u(1)} P_e^{-1} \big(\hat{x}_{e \setminus u}^{(i)} + a_{u,e}\, \zeta_e\big), \tag{3.5}$$

where $e \setminus u$, for an edge e incident on u, denotes the “other end” of e,¹
$a_{u,e}$ is the (u,e)th entry of the incidence matrix A (defined in Section 2.2.1),
and $\vec{\mathcal{E}}_u(1)$ is the set of edges in the measurement graph $\mathcal{G}$ that are incident
on the node u such that there are communication edges from their other
ends toward u:

$$\vec{\mathcal{E}}_u(1) := \{e \in \mathcal{E} \mid e \sim u,\ (e \setminus u, u) \in \mathcal{E}_c\}. \tag{3.6}$$

The arrow in the notation is used to emphasize that the measurement edges
in $\vec{\mathcal{E}}_u(1)$ depend on the direction of the edges in the communication graph.
After the computation, node u broadcasts the new estimate $\hat{x}_u^{(i+1)}$ to all its
neighbors that can receive messages from it.

3. At the end of the ith iteration, node u listens for the broadcasts from its
1-hop in-neighbors, which are used to update the node variable estimate
$\hat{x}_v^{(i+1)}$ for each $v \in \mathcal{N}_u^c$. Once all updates are received, a new iteration can
start.

¹ That is, if e = (v, u), then $e \setminus u = v$ and $e \setminus v = u$.

These iterations can be terminated at a node when the change in its recent
estimate is seen to be lower than a pre-specified threshold value or a pre-specified
maximum number of iterations is completed. Figure 3.3 shows the relevant
equations for one iteration of the Jacobi algorithm applied to the measurement
graph shown in Figure 2.6.

To gain insight into the Jacobi algorithm, imagine for the moment that when
node u receives from its communication neighbors their current estimates $\hat{x}_v^{(i)}$,
$v \in \mathcal{N}_u^c$, it believes those are the optimal estimates of the corresponding node
variables. In that case, node u can compute its optimal estimate by using the
measurements between itself and its communication neighbors. This estimation
problem is no different from the original BLUE estimation problem, except that
it is defined over the much smaller graph $\mathcal{G}_u(1) = (\mathcal{V}_u(1), \vec{\mathcal{E}}_u(1))$, whose nodes
include u and its communication neighbors:

$$\mathcal{V}_u(1) = \{u\} \cup \mathcal{N}_u^c.$$

We call $\mathcal{G}_u(1)$ the 1-hop communication-enabled subgraph of $\mathcal{G}$ centered at u. Since
u thinks that the node variables of its neighbors are exactly known, all of these
nodes should be understood as references, so that $\mathcal{G}_u(1)$ has only one unknown
node variable, namely, $x_u$. Node u can now compute an estimate of its node
variable by solving the BLU estimation problem associated with the 1-hop subgraph
$\mathcal{G}_u(1)$, which turns out to be (3.5). The Jacobi algorithm can therefore be thought
of as an algorithm in which every node solves a local optimal estimation problem
by assuming its neighbors' estimates are correct, and repeats this
computation as those estimates are updated. The name “Jacobi algorithm”
comes from the fact that when communication is symmetric, the update
equation (3.5) is essentially the Jacobi method for solving the linear equation (2.6)
that defines the optimal estimate [75].
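
The update (3.5) can be sketched in a few lines for scalar node variables. This is an illustrative sketch, not the implementation used for the results in this chapter: the four-node graph, the measurement values, the unit variances, and the names `jacobi_step`, `max_residual` and `run` are all assumptions made for the example. A measurement tuple `(u, v, zeta, var)` is taken to mean $\zeta = x_u - x_v$ plus noise of variance `var`, and a communication edge `(v, u)` means u can receive from v.

```python
def jacobi_step(x, ref, meas, comm):
    """One synchronous Jacobi iteration (3.5) for scalar node variables."""
    new_x = dict(x)
    for u in x:
        if u in ref:
            continue  # reference nodes keep their known values
        num = den = 0.0
        for (a, b, zeta, var) in meas:
            if u == a and (b, u) in comm:
                other, sign = b, 1.0   # u is the tail: x_u = x_b + zeta
            elif u == b and (a, u) in comm:
                other, sign = a, -1.0  # u is the head: x_u = x_a - zeta
            else:
                continue               # edge not incident on u, or other end not heard
            num += (x[other] + sign * zeta) / var
            den += 1.0 / var
        if den > 0:
            new_x[u] = num / den
    return new_x


def max_residual(x, ref, meas):
    """Largest violation of the weighted least-squares normal equations."""
    worst = 0.0
    for u in x:
        if u in ref:
            continue
        r = 0.0
        for (a, b, zeta, var) in meas:
            if u == a:
                r += (x[a] - x[b] - zeta) / var
            elif u == b:
                r -= (x[a] - x[b] - zeta) / var
        worst = max(worst, abs(r))
    return worst


def run(comm, iterations=200):
    x = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}  # node 1 is the reference, fixed at 0
    for _ in range(iterations):
        x = jacobi_step(x, ref, meas, comm)
    return x


# Hypothetical data: 4 nodes, 5 noisy relative measurements, unit variances.
ref = {1}
meas = [(2, 1, 1.1, 1.0), (3, 1, 1.9, 1.0), (3, 2, 1.2, 1.0),
        (4, 2, 2.1, 1.0), (4, 3, 0.8, 1.0)]
sym = {(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (4, 2), (3, 4), (4, 3)}
asym = sym - {(3, 4)}  # node 4 can no longer hear node 3

x_sym = run(sym)
x_asym = run(asym)
print(x_sym, max_residual(x_sym, ref, meas))
print(x_asym, max_residual(x_asym, ref, meas))
```

With the symmetric communication graph the fixed point satisfies the normal equations of the weighted least-squares problem, so the residual is numerically zero. With the asymmetric graph node 4 relies on node 2 alone, and the limit no longer satisfies them.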

A measurement graph and an associated symmetric communication graph:

[drawings of $\mathcal{G}$ and $\mathcal{G}_c$]

The 1-hop subgraphs of node 4, with and without considering the
communication graph, are shown below. Since communication is symmetric, these are the
same.

[drawings of the two 1-hop subgraphs centered at node 4]

In the subgraph $\mathcal{G}_4(1)$, the measurement model (2.4) for the only unknown
variable $x_4$, when $x_2$ and $x_3$ are taken as references, is

$$\underbrace{\begin{bmatrix} \zeta_3 \\ \zeta_5 \end{bmatrix}}_{\zeta} = \underbrace{\begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}}_{A_r^T} \underbrace{\begin{bmatrix} x_2 \\ x_3 \end{bmatrix}}_{x_r} + \underbrace{\begin{bmatrix} -I \\ I \end{bmatrix}}_{A_b^T} x_4 + \begin{bmatrix} \epsilon_3 \\ \epsilon_5 \end{bmatrix}.$$

The corresponding optimal estimate (2.6) when all measurement covariance
matrices are equal to the identity matrix is given by

$$\hat{x}_4^* = (A_b A_b^T)^{-1} A_b (\zeta - A_r^T x_r) = \frac{1}{2}(x_2 - \zeta_3 + x_3 + \zeta_5).$$

Figure 3.3. The Jacobi Iteration with symmetric communication. The iterations
of the Jacobi algorithm are explained according to the interpretation that every
node is repeatedly solving a local optimal estimation problem that is defined over
the 1-hop subgraph centered at itself.


(continued from Figure 3.3) Since node 4 can receive messages from all of
its measurement neighbors, the Jacobi iteration for node 4 is

$$\hat{x}_4^{(i+1)} = \frac{1}{2}\big(\hat{x}_2^{(i)} - \zeta_3 + \hat{x}_3^{(i)} + \zeta_5\big).$$

A similar construction based on the 1-hop subgraphs centered at nodes 2
and 3 leads to update equations for estimates of $x_2$ and $x_3$ given by

$$\hat{x}_2^{(i+1)} = \frac{1}{4}\big(\hat{x}_3^{(i)} + \hat{x}_4^{(i)} + \zeta_3 + \zeta_4 - \zeta_1 - \zeta_2\big),$$

$$\hat{x}_3^{(i+1)} = \frac{1}{2}\big(\hat{x}_2^{(i)} + \hat{x}_4^{(i)} - \zeta_4 - \zeta_5\big).$$

The reference node, which is node 1, is assumed to be at the origin, and
thus $x_1$ does not appear in the equations.

Figure 3.4. Figure 3.3 contd.


3.3.0.1 Asynchronous implementation


The algorithm described by (3.5) is synchronous, since that description
implicitly assumes that the iteration counter i is common to all the nodes. This
means that all nodes update their estimates at the same time after getting updates
from all of their communication neighbors. Moreover, the description above
implicitly assumes that there are no communication faults, i.e., a node is always able
to receive data from all of its communication neighbors. In practice, nodes may
have varying processor power, so that one may be ready to start the next iteration
while other nodes have not finished their computation. In addition, in wireless
networks with time-dependent communication failures or nodes with scheduled
sleep-wake cycles, the communication graph may be time varying. In both these
cases, waiting to get information from all neighbors may not be advisable.
Instead, the algorithm can be implemented in an asynchronous fashion, in which
nodes wait for a “time-out” period to receive estimates from their neighbors. If
estimates from some neighbors do not arrive in this time, they use the previously
received data from those nodes. Asynchronous implementation therefore requires
a local buffer to store data that is successfully received from neighboring nodes
for use later when communication fails.
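
The time-out and buffer mechanism can be illustrated with a small simulation. The sketch below is a simplified model under stated assumptions, not the implementation analyzed in this chapter: node variables are scalars, each broadcast on each communication edge is lost independently with probability `p_fail` in every time-out period, and the graph, measurement values and the name `async_jacobi` are hypothetical. A measurement tuple `(u, v, zeta, var)` means $\zeta = x_u - x_v$ plus noise of variance `var`.

```python
import random


def async_jacobi(meas, ref_values, comm, steps, p_fail, seed=0):
    """Asynchronous Jacobi sketch with per-node buffers (scalar variables).

    meas:       list of (u, v, zeta, var)
    ref_values: dict mapping each reference node to its known value
    comm:       set of directed edges (v, u) found in the detection phase
    p_fail:     probability that a broadcast on an edge is lost in one period
    """
    rng = random.Random(seed)
    nodes = {n for (u, v, _, _) in meas for n in (u, v)}
    x = {n: ref_values.get(n, 0.0) for n in nodes}
    # buffer[u][v]: last estimate of x_v that u received (arbitrary initial guess 0.0)
    buffer = {u: {v: 0.0 for (v, w) in comm if w == u} for u in nodes}
    for _ in range(steps):
        # Broadcast phase: every edge that does not fail delivers the current estimate.
        for (v, u) in comm:
            if rng.random() >= p_fail:
                buffer[u][v] = x[v]
        # Update phase: each non-reference node applies (3.5) to its buffered values.
        for u in nodes:
            if u in ref_values:
                continue
            num = den = 0.0
            for (a, b, zeta, var) in meas:
                if u == a and b in buffer[u]:
                    num += (buffer[u][b] + zeta) / var
                    den += 1.0 / var
                elif u == b and a in buffer[u]:
                    num += (buffer[u][a] - zeta) / var
                    den += 1.0 / var
            if den > 0:
                x[u] = num / den
    return x


# Hypothetical data: node 1 is the reference at 0; unit variances; symmetric links.
meas = [(2, 1, 1.1, 1.0), (3, 1, 1.9, 1.0), (3, 2, 1.2, 1.0),
        (4, 2, 2.1, 1.0), (4, 3, 0.8, 1.0)]
comm = {(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (4, 2), (3, 4), (4, 3)}
x = async_jacobi(meas, {1: 0.0}, comm, steps=1000, p_fail=0.3)
print(x)
```

Because a node falls back on buffered values when a broadcast is lost, temporary failures slow the iteration down without changing its limit in this example; the weighted least-squares solution for this data is $x_2 = 0.9375$, $x_3 = 2.0625$, $x_4 = 2.95$.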

Consider a time index $t \in \mathbb{N}$ that is incremented by 1 at the end of every
time-out period. A communication edge (u, v) is said to fail at time t if during the time
between t − 1 and t, broadcasts from node u fail to reach v. A node u is said to fail
at time t if during the time between t − 1 and t, node u either does not broadcast
its current estimate or does not process any information. Such node failures can
occur due to sleep-scheduling [54], among other reasons. We do not consider other
possible forms of node failure, such as a failed node sending incorrect or
random data. We say that a communication edge (u, v) is active at time t if at that
time, neither of the nodes u and v fails and the communication edge (u, v) does
not fail. At every time t, the communication graph $\mathcal{G}_c(t) = (\mathcal{V}, \mathcal{E}_c(t))$ consists of
all the nodes of $\mathcal{G}$ and all the communication edges that are active at that time.

A measurement graph and an associated asymmetric communication graph:

If the communication graph were symmetric, the Jacobi iteration for node 4
would have been the same as that shown in Figure 3.3. However, since node
4 now cannot receive messages from node 3, the Jacobi iteration for node 4
becomes

$$\hat{x}_4^{(i+1)} = \hat{x}_2^{(i)} - \zeta_3.$$

The iterations for nodes 2 and 3 are the same as the ones shown in Figure 3.4,
since both of them can receive messages from all of their measurement
neighbors.

Figure 3.5. The Jacobi Iteration with asymmetric communication.

Although buffers can partially mitigate the effects of time variation in the
communication, certain difficulties still remain. Since every node u has to keep and
update estimates of the variables of its 1-hop communication-neighbors (see (3.5)),
node u has to know its communication-neighbors in advance. Consider the
following situation: nodes u and v are neighbors in the measurement graph, but u
does not receive any communication from v for the first, say, 10 iterations, and
then receives a message from v at the 11th iteration, and never receives any
communication from v thereafter. Should node u initialize another local variable for
$x_v$'s current estimate at the 11th iteration? If so, what should it do when it sees
that all communication from v has ceased thereafter? What if another neighbor
appears at the 100th iteration?

To avoid such difficulties, we assume that every node $u \in \mathcal{V} \setminus \mathcal{V}_r$ detects
its communication-neighbors during an initial detection phase, before the
iterations begin. This detection, which may be carried out even during the process
of obtaining the relative measurements, leads to an initial communication graph
$\mathcal{G}_c^{\mathrm{init}} = (\mathcal{V}, \mathcal{E}_c^{\mathrm{init}})$ consisting of those communication edges that were active over
the time interval of the detection phase. A node does not update its
communication neighborhood $\mathcal{N}_u^c$ thereafter.
3.3.1  Correctness and performance analysis of the Jacobi algorithm

To analyze the algorithm, we now define a few matrices. The combined incidence matrix $A_c \in \mathbb{R}^{n \times m}$, where $m$ is the number of edges in the measurement graph $\mathcal{G}$, for the pair of directed graphs $(\mathcal{G}, \mathcal{G}_c)$ is defined in the following manner:

$$[A_c]_{u,e} = \begin{cases} a_{u,e} & \text{if } e \in \mathcal{E},\ e \sim u,\ (e \setminus u, u) \in \mathcal{E}_c, \\ 0 & \text{otherwise,} \end{cases} \tag{3.7}$$

where $a_{u,e}$ is the $(u, e)$th entry of the incidence matrix $A$ for the measurement graph $\mathcal{G}$. Recall that for an edge $e$ that is incident on a node $u$, $e \setminus u$ denotes the node at the other end of $e$ (see Section 3.3). The incidence matrix was defined in Chapter 2 and is standard in algebraic graph theory. The combined incidence matrix $A_c$ is not standard in graph theory, though. Note that $A_c(\mathcal{G}, \mathcal{G}_c) = A(\mathcal{G})$ if and only if the communication graph $\mathcal{G}_c$ is symmetric.

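The weighted in-degree matrix $\mathcal{D} \in \mathbb{R}^{kn \times kn}$ of the directed graph pair $(\mathcal{G}, \mathcal{G}_c)$ is defined as a block-diagonal matrix with

$$[\mathcal{D}]_{u,u} = \sum_{e \in \mathcal{E}_u^c(1)} P_e^{-1}. \tag{3.8}$$

Similarly, the weighted adjacency matrix $\mathcal{C} \in \mathbb{R}^{kn \times kn}$ of the directed graph pair $(\mathcal{G}, \mathcal{G}_c)$ is defined as

$$[\mathcal{C}]_{u,v} = \begin{cases} P_e^{-1} & \text{if } \exists e \in \mathcal{E} \text{ such that } e \sim u,\ e \sim v \text{ and } (v, u) \in \mathcal{E}_c, \\ 0 & \text{otherwise.} \end{cases} \tag{3.9}$$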

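Let $\mathcal{M}, \mathcal{N} \in \mathbb{R}^{k n_b \times k n_b}$ be the sub-matrices of $\mathcal{D}$ and $\mathcal{C}$, respectively, obtained by removing the rows and columns corresponding to the reference nodes, where $n_b = |\mathcal{V} \setminus \mathcal{V}_r|$ is the number of nodes that do not know their variables:

$$[\mathcal{M}]_{u,v} = \begin{cases} [\mathcal{D}]_{u,u} & \text{if } u = v,\ u \in \mathcal{V} \setminus \mathcal{V}_r, \\ 0 & u \neq v, \end{cases} \tag{3.10}$$

$$[\mathcal{N}]_{u,v} = [\mathcal{C}]_{u,v} \quad \text{for } u, v \in \mathcal{V} \setminus \mathcal{V}_r. \tag{3.11}$$

Figure 3.6 shows an example of these matrices. Now define

$$:= \mathcal{D} - \mathcal{C}, \tag{3.12}$$

$$\mathcal{L}_c := \mathcal{M} - \mathcal{N}. \tag{3.13}$$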

It is straightforward to verify that

$$\mathcal{L}_c = \mathcal{A}_b^c \mathcal{P}^{-1} \mathcal{A}_b^T, \tag{3.14}$$

where $\mathcal{A}_b^c := A_b^c \otimes I_k$ and $A_b^c$ is the basis combined incidence matrix of $(\mathcal{G}, \mathcal{G}_c)$, obtained from $A_c$ by removing from it the rows that correspond to the reference nodes. It can be verified that $\mathcal{L}_c$ does not depend on the edge directions of $\mathcal{G}$ but does depend on those of $\mathcal{G}_c$. Figure 3.6 shows an example of all of the matrices described above.
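The definitions (3.7)-(3.14) can be checked numerically on a small example. The following is a minimal sketch for the scalar case $k = 1$, under several assumptions that are not taken from the text: the four-node graph, the edge weights (inverse variances), the sign convention of the incidence matrix (+1 at the first node of an edge, -1 at the second), and all function names are illustrative only. A pair `(v, u)` in `comm` means that $v$ can send to $u$.

```python
import numpy as np

def incidence(n, edges):
    """Incidence matrix A (n x m); assumed convention: +1 at first node, -1 at second."""
    A = np.zeros((n, len(edges)))
    for j, (u, v) in enumerate(edges):
        A[u, j], A[v, j] = 1.0, -1.0
    return A

def combined_incidence(A, edges, comm):
    """A_c as in (3.7): keep a_{u,e} only if u hears the node at the other end of e."""
    Ac = np.zeros_like(A)
    for j, (u, v) in enumerate(edges):
        if (v, u) in comm:          # v -> u communication exists
            Ac[u, j] = A[u, j]
        if (u, v) in comm:          # u -> v communication exists
            Ac[v, j] = A[v, j]
    return Ac

def in_degree_and_adjacency(n, edges, comm, weights):
    """Weighted in-degree D (3.8) and adjacency C (3.9) for k = 1, simple graph."""
    D, C = np.zeros((n, n)), np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        if (v, u) in comm:
            D[u, u] += w
            C[u, v] = w
        if (u, v) in comm:
            D[v, v] += w
            C[v, u] = w
    return D, C

# Hypothetical example: node 0 is the reference, nodes 1..3 are unknown.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
weights = np.array([1.0, 0.5, 2.0, 1.0])          # inverse variances 1/P_e
sym = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
comm = sym - {(3, 1)}                              # node 1 cannot hear node 3
free = [1, 2, 3]                                   # non-reference nodes

A = incidence(4, edges)
Ac = combined_incidence(A, edges, comm)
D, C = in_degree_and_adjacency(4, edges, comm, weights)
M, N = D[np.ix_(free, free)], C[np.ix_(free, free)]
Lc = M - N                                         # (3.13)
Lc_from_incidence = Ac[free] @ np.diag(weights) @ A[free].T   # (3.14)
```

In this sketch `Lc` and `Lc_from_incidence` agree, `Lc` is not symmetric because one communication direction is missing, and `combined_incidence(A, edges, sym)` reproduces `A`, in line with the remark that $A_c = A$ when the communication graph is symmetric.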

We  conclude  the  definitions  with  the  following  observation: 

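Proposition 3.3.1. The matrix $\mathcal{M}$ defined in (3.10) for a graph pair $(\mathcal{G}, \mathcal{G}_c)$ is positive definite. □


Proof of Proposition 3.3.1. Assumption 3.1.1 ensures that for every node $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is at least one measurement edge $e \in \mathcal{E}$, $e \sim u$, such that there is an incoming communication edge in $\mathcal{G}_c$ from the node at the other end of $e$, i.e., $(e \setminus u, u) \in \mathcal{E}_c$. As a result, the set $\mathcal{E}_u^c(1)$ is non-empty for every $u \in \mathcal{V} \setminus \mathcal{V}_r$. Every diagonal block of $\mathcal{M}$ is therefore a sum of at least one positive definite matrix $P_e^{-1}$, and the result now follows from the definition of $\mathcal{D}$ and $\mathcal{M}$ in (3.8) and (3.10). ■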
We start with the synchronous, time-invariant case, in which every node updates its estimate at the same time instant and the communication graph is fixed for all time (i.e., there are no node or edge failures). In this case, the Jacobi algorithm (3.5) can be compactly expressed as the following discrete-time dynamical system:

$$\mathcal{M} \hat{\mathbf{x}}^{(i+1)} = \mathcal{N} \hat{\mathbf{x}}^{(i)} + \mathbf{b}_c \;\Rightarrow\; \hat{\mathbf{x}}^{(i+1)} = \mathcal{M}^{-1} \mathcal{N} \hat{\mathbf{x}}^{(i)} + \mathcal{M}^{-1} \mathbf{b}_c, \tag{3.15}$$

where

$$\mathbf{b}_c := \mathcal{A}_b^c \mathcal{P}^{-1} (\mathbf{z} - \mathcal{A}_r^T \mathbf{x}_r). \tag{3.16}$$

The fixed point of the iteration (3.15) is given by the solution of the following system of linear equations, when it exists:

$$\mathcal{L}_c \hat{\mathbf{x}}^{\infty} = \mathbf{b}_c. \tag{3.17}$$

For $\hat{\mathbf{x}}^{\infty}$ to exist, $\mathcal{L}_c$ must be invertible. In the next section we describe the conditions under which it is so, and when the Jacobi algorithm converges to this solution. It is important to note here that $\mathcal{L}_c = \mathcal{L}$ if and only if all the inter-node communication is symmetric. If there is asymmetry in the communication graph, the fixed point of the Jacobi iteration will not coincide with the BLU estimates. This will be stated more precisely in the next section.
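The matrix form (3.15)-(3.17) lends itself to a short numerical illustration. The sketch below, for $k = 1$, iterates (3.15) and compares the limit with the solution of (3.17), once for symmetric and once for asymmetric communication. The graph, weights, true node values, noise vector and incidence sign convention are hypothetical choices made only for this example, and the measurement model $\mathbf{z} = A^T \mathbf{x} + \epsilon$ is assumed to match (2.4).

```python
import numpy as np

# Hypothetical 4-node example; node 0 is the reference.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
W = np.diag([1.0, 0.5, 2.0, 1.0])                 # P^{-1} for scalar measurements
x_true = np.array([0.5, 1.0, -0.5, 2.0])
noise = np.array([0.1, -0.2, 0.05, 0.15])          # fixed "measurement errors"

A = np.zeros((4, 4))
for j, (u, v) in enumerate(edges):
    A[u, j], A[v, j] = 1.0, -1.0                   # assumed sign convention
z = A.T @ x_true + noise                           # assumed form of (2.4)

Ac = A.copy()
Ac[1, 3] = 0.0        # node 1 cannot hear node 3: edge (1,3) dropped from row 1

Ar, Ab, Acb = A[:1], A[1:], Ac[1:]                 # reference / basis rows
xr = x_true[:1]                                    # reference value is known

def jacobi(Acb_, n_iter=1000):
    """Iterate (3.15) and also return the direct solution of (3.17)."""
    Lc = Acb_ @ W @ Ab.T                           # (3.14)
    bc = Acb_ @ W @ (z - Ar.T @ xr)                # (3.16)
    M = np.diag(np.diag(Lc))                       # N has zero diagonal, so diag(Lc) = M
    N = M - Lc
    x = np.zeros(3)
    for _ in range(n_iter):
        x = np.linalg.solve(M, N @ x + bc)
    return x, np.linalg.solve(Lc, bc)

x_sym_iter, x_star = jacobi(Ab)      # symmetric communication: Lc = L, limit is the BLU estimate
x_asym_iter, x_inf = jacobi(Acb)     # asymmetric communication
```

For this example both iterations reach their respective fixed points, and the asymmetric fixed point `x_inf` differs from the BLU estimate `x_star`, which is the behavior described above.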

We  end  this  section  with  the  observation  that  when  the  limiting  estimate  x°° 
exists,  it  is  an  unbiased  estimate  of  the  node  variable  vector  x. 

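Lemma 3.3.1. The unique solution to $\mathcal{L}_c \hat{\mathbf{x}}^{\infty} = \mathbf{b}_c$, when it exists, is an unbiased estimate of $\mathbf{x}$, and the covariance of the estimation error $\mathbf{e}^{\infty} := \mathbf{x} - \hat{\mathbf{x}}^{\infty}$ is given by

$$\Sigma := \mathrm{E}[\mathbf{e}^{\infty} \mathbf{e}^{\infty T}] = \mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} \mathcal{A}_b^{cT} \mathcal{L}_c^{-T}. \qquad □$$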
Figure 3.6. The matrices used to compactly represent the Jacobi algorithm with asymmetric communication, for the pair $(\mathcal{G}, \mathcal{G}_c)$ shown. For simplicity, we consider the case $k = 1$, and every measurement error variance is 1. We use $M$, $N$, $D$, $C$ and $L_c$ to represent $\mathcal{M}$, $\mathcal{N}$, $\mathcal{D}$, $\mathcal{C}$ and $\mathcal{L}_c$, respectively, to emphasize that $k = 1$.
Figure 3.7. Figure 3.6 contd. The in-degree and adjacency matrices $D$ and $C$ of $(\mathcal{G}, \mathcal{G}_c)$, and their submatrices obtained by removing the rows and columns corresponding to the reference nodes.
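Proof of Lemma 3.3.1. From (3.17), (3.16), (3.14) and (2.4), we get

$$\hat{\mathbf{x}}^{\infty} = \mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} (\mathbf{z} - \mathcal{A}_r^T \mathbf{x}_r) = \mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} (\mathcal{A}_b^T \mathbf{x} + \epsilon) = \mathbf{x} + \mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} \epsilon,$$

where the last equality follows from (3.14). It follows that $\mathrm{E}[\hat{\mathbf{x}}^{\infty}] = \mathbf{x}$, so the estimate is unbiased, and also that $\mathbf{e}^{\infty} = -\mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} \epsilon$. Therefore, since $\mathrm{E}[\epsilon \epsilon^T] = \mathcal{P}$, the covariance of the estimation error $\mathbf{e}^{\infty}$ is

$$\mathrm{E}[\mathbf{e}^{\infty} \mathbf{e}^{\infty T}] = \mathrm{E}\left[\mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} \epsilon \epsilon^T \mathcal{P}^{-1} \mathcal{A}_b^{cT} \mathcal{L}_c^{-T}\right] = \mathcal{L}_c^{-1} \mathcal{A}_b^c \mathcal{P}^{-1} \mathcal{A}_b^{cT} \mathcal{L}_c^{-T},$$

which proves the result.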


3.3.1.1  Correctness of the Jacobi algorithm

First we analyze the simplest case of symmetric, time-invariant communication.

Theorem 3.3.1. Consider the Jacobi algorithm implemented on a measurement and time-invariant communication graph pair $(\mathcal{G}, \mathcal{G}_c)$ that satisfies Assumption 3.1.1. If the communication graph $\mathcal{G}_c$ is symmetric, then the synchronous Jacobi algorithm is correct. □

To prove Theorem 3.3.1, we will also need the following technical result from [77].

Proposition 3.3.2 (Lemma 4.2 of [77]). Let $X := D - N$ be a square matrix such that $D + D^* > 0$ and $X_\theta := D + D^* - (e^{i\theta} N + e^{-i\theta} N^*) > 0$ for all $\theta \in \mathbb{R}$. Then $\rho(D^{-1} N) < 1$, where $\rho(\cdot)$ denotes the spectral radius. □
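Proof of Theorem 3.3.1. Since the communication graph is symmetric, $\mathcal{L}_c = \mathcal{L}$ and $\mathbf{b}_c = \mathbf{b}$. Therefore $\mathcal{L} = \mathcal{M} - \mathcal{N}$, so that we can rewrite (2.6), the equation defining the BLU estimate, as

$$\mathcal{M} \hat{\mathbf{x}}^* = \mathcal{N} \hat{\mathbf{x}}^* + \mathbf{b} \;\Rightarrow\; \hat{\mathbf{x}}^* = \mathcal{M}^{-1} \mathcal{N} \hat{\mathbf{x}}^* + \mathcal{M}^{-1} \mathbf{b}.$$

Comparing the above with (3.15), we see that the algorithm's error (defined in (3.2)) evolves according to

$$\mathbf{e}^{*(i+1)} = \mathcal{J} \mathbf{e}^{*(i)}, \tag{3.18}$$

where

$$\mathcal{J} := \mathcal{M}^{-1} \mathcal{N} \tag{3.19}$$

is the Jacobi iteration matrix. Therefore, to prove the theorem, we need to show that $\rho(\mathcal{M}^{-1} \mathcal{N}) < 1$. Since $\mathcal{L} = \mathcal{M} - \mathcal{N} > 0$ and $\mathcal{M} > 0$, it follows that $\mathcal{M} > \mathcal{N} \geq \cos\theta\, \mathcal{N}$ for every $\theta \in \mathbb{R}$. From Proposition 3.3.2 it follows that $\rho(\mathcal{M}^{-1} \mathcal{N}) < 1$, which proves the theorem. ■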

To analyze the correctness and performance of the algorithm when the communication graph is asymmetric and possibly time-varying, we make an additional assumption:

Assumption 3.3.1 (Diagonal). One of the following two conditions holds:

1.  the measurement error covariance matrices are all equal to one another, i.e., $\exists P_o \in \mathbb{S}_k^+$ such that $P_e = P_o$, $\forall e \in \mathcal{E}$, or,

2.  every measurement error covariance matrix is diagonal (but the matrices are not necessarily equal to one another). □
The case of vector-valued variables can be reduced to the scalar-valued case as long as Diagonal Assumption 3.3.1 is satisfied, as we now show.

If the covariance matrices are all equal, then the covariances play no role and can be taken as the identity matrix. To see why, let $P_e = P_o \in \mathbb{R}^{k \times k}$, $\forall e \in \mathcal{E}$; then $\mathcal{P} = I_m \otimes P_o$, where $\otimes$ denotes the Kronecker product and $m$ is the number of edges in $\mathcal{G}$. Equation (2.6) now simplifies to

$$(L \otimes P_o^{-1}) \hat{\mathbf{x}}^* = (A_b \otimes P_o^{-1}) \mathbf{z},$$

where $L := A_b A_b^T$ is the unweighted Dirichlet (grounded) Laplacian matrix of the graph $\mathcal{G}$. Simple algebraic manipulation using the rules of Kronecker algebra shows that the solution to this equation is $\hat{\mathbf{x}}^* = ((L^{-1} A_b) \otimes I_k) \mathbf{z}$. Therefore we need only solve $(L \otimes I_k) \hat{\mathbf{x}}^* = (A_b \otimes I_k) \mathbf{z}$ to get the optimal estimate. We can decompose the above into the following $k$ systems of decoupled equations:

$$L \mathbf{x}_j = \mathbf{b}_j, \quad j = 1, \dots, k, \tag{3.20}$$

where $\mathbf{x}_j := [x_{1,j}, \dots, x_{n_b,j}]^T \in \mathbb{R}^{n_b}$ is the vector of the $j$th components of the node variables, and $\mathbf{b}_j = A_b \mathbf{z}_j$, with $\mathbf{z}_j := [\zeta_{1,j}, \dots, \zeta_{m,j}]^T \in \mathbb{R}^m$ being the vector of the $j$th components of the entries in $\mathbf{z}$. When the covariance matrices are diagonal, it is again straightforward to show that the estimation problem for each of the $k$ components of every node variable is decoupled. In light of the discussion above, when the Diagonal Assumption 3.3.1 holds, we only need to consider the case when the variables and measurements are scalar-valued.
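The decoupling argument for equal covariances can be checked numerically. The following minimal sketch compares the joint Kronecker-structured solve with the $k$ separate scalar solves of (3.20). The graph, the common covariance `Po`, the random measurement vector and the stacking order (one block of $k$ entries per edge in $\mathbf{z}$, one block per node in $\mathbf{x}$) are assumptions made for illustration, and the reference contribution is taken to be already removed from $\mathbf{z}$.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2                                              # dimension of each node variable
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]           # node 0 is the reference
m = len(edges)

A = np.zeros((4, m))
for j, (u, v) in enumerate(edges):
    A[u, j], A[v, j] = 1.0, -1.0                   # assumed sign convention
Ab = A[1:]                                         # basis incidence matrix (3 x m)
L = Ab @ Ab.T                                      # unweighted grounded Laplacian

Po = np.array([[2.0, 0.6], [0.6, 1.0]])            # common (non-diagonal) covariance
Po_inv = np.linalg.inv(Po)
z = rng.normal(size=m * k)                         # stacked measurements, k per edge

# Joint solve of (L kron Po^{-1}) x = (Ab kron Po^{-1}) z.
x_joint = np.linalg.solve(np.kron(L, Po_inv), np.kron(Ab, Po_inv) @ z)

# k decoupled scalar problems L x_j = Ab z_j, as in (3.20).
Z = z.reshape(m, k)                                # row e holds the k components of edge e
X_decoupled = np.column_stack(
    [np.linalg.solve(L, Ab @ Z[:, j]) for j in range(k)]
)
```

Reshaping `x_joint` to one row per node gives the same array as `X_decoupled`, so the common covariance `Po` has no influence on the estimate in this setting.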

In the sequel, to emphasize that we are proving results only for scalar-valued variables, we denote the matrices $\mathcal{L}_c$, $\mathcal{M}$, and $\mathcal{N}$ in the special case $k = 1$ by $L_c$, $M$, and $N$. One of the main advantages of considering the scalar case of $k = 1$ is that $L_c$ turns out to be an M-matrix [78, Chapter 6]. M-matrices are a special class of matrices that have been widely studied in the context of iterative methods for solving linear equations, since they arise naturally in the numerical solution of PDEs and in the convergence analysis of matrix-iteration processes, and they possess a number of useful properties that help establish convergence of asynchronous parallel iterative methods [78, 79]. Non-singular M-matrices are termed matrices of class K by Fiedler and Ptak [80].

To define M-matrices, we start with matrices of class Z, which are square matrices with non-positive off-diagonal entries. A matrix $X$ of class Z is an M-matrix if it can be written as $X = sI - B$ where $B \succeq 0$ and $s \geq \rho(B)$. Here and in the sequel, $\succ$ ($\succeq$) is used to denote entry-wise ordering. That is, for a matrix or a vector $X$, $X \succ (\succeq)\ 0$ means every entry of $X$ is positive (non-negative). For more information on M-matrices, the reader is referred to Chapter 6 of [78], where 50 equivalent characterizations of M-matrices are provided. We start with the following technical result:

Lemma 3.3.2. Consider the matrices $L_c, M, N \in \mathbb{R}^{n_b \times n_b}$ (i.e., the matrices $\mathcal{L}_c$, $\mathcal{M}$, $\mathcal{N}$ for the case $k = 1$) defined in (3.12), (3.10), and (3.11) for a measurement and time-invariant communication graph pair $(\mathcal{G}, \mathcal{G}_c)$ that satisfies Assumption 3.1.1. The matrix $L_c$ defined for $(\mathcal{G}, \mathcal{G}_c)$ is an M-matrix. Moreover, the following statements are equivalent:

1.  The matrix $L_c$ is a non-singular M-matrix.

2.  $\rho(J) < 1$, where $J := M^{-1} N$ is the Jacobi iteration matrix (for $k = 1$) and $\rho(\cdot)$ denotes the spectral radius.

3.  For every node $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is a directed path in the communication graph $\mathcal{G}_c$ from at least one of the reference nodes to $u$. □
A directed path $\mathcal{P}$ from a node $p_1$ to another node $p_m$ in a graph $\mathcal{G}$ is an alternating sequence of a finite number of nodes and edges that starts with $p_1$ and ends with $p_m$:

$$\mathcal{P} = \{p_1, e_1, p_2, e_2, \dots, p_i, e_i, p_{i+1}, \dots, e_{m-1}, p_m\}$$

such that every edge $e_i$ in the path is directed from the previous node to the next node in the path: $e_i = (p_i, p_{i+1})$. Note that edge directions matter in this definition.
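The equivalence between statements 2 and 3 of Lemma 3.3.2 can be illustrated on a small example. The sketch below computes $\rho(M^{-1}N)$ for two hypothetical communication graphs over the same measurement graph and checks, with a breadth-first search, whether every node is reachable from the reference along directed communication edges. The graph, weights and function names are assumptions for illustration; a pair `(a, b)` in `comm` means $a$ can send to $b$, and in both cases every non-reference node hears at least one measurement neighbor, so that $M$ is invertible.

```python
import numpy as np
from collections import deque

edges = [(0, 1), (1, 2), (2, 3), (1, 3)]           # measurement edges, node 0 is the reference
weights = [1.0, 0.5, 2.0, 1.0]                     # inverse variances
refs, free = {0}, [1, 2, 3]

def jacobi_radius(comm):
    """Spectral radius of J = M^{-1} N for the pair (G, G_c), scalar case."""
    D, C = np.zeros((4, 4)), np.zeros((4, 4))
    for (u, v), w in zip(edges, weights):
        if (v, u) in comm:
            D[u, u] += w
            C[u, v] = w
        if (u, v) in comm:
            D[v, v] += w
            C[v, u] = w
    M, N = D[np.ix_(free, free)], C[np.ix_(free, free)]
    return max(abs(np.linalg.eigvals(np.linalg.solve(M, N))))

def all_reachable(comm):
    """True if every node has a directed path in G_c from some reference node."""
    seen, queue = set(refs), deque(refs)
    while queue:
        s = queue.popleft()
        for a, b in comm:
            if a == s and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen == set(range(4))

# Every node can be reached from the reference: 0 -> 1 -> 2 -> 3.
connected = {(0, 1), (1, 2), (2, 1), (2, 3), (3, 2)}
# Nodes 2 and 3 only hear each other; no path from the reference reaches them.
isolated = {(0, 1), (2, 3), (3, 2)}

rho_connected, rho_isolated = jacobi_radius(connected), jacobi_radius(isolated)
```

In the first case the spectral radius is below one and all nodes are reachable; in the second, nodes 2 and 3 simply exchange their estimates forever, the spectral radius equals one, and the iteration cannot converge from arbitrary initial conditions.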

The proofs of Lemma 3.3.2 and the theorem below, which describes the behavior of the Jacobi algorithm with (possibly) asymmetric communication, are provided in Section 3.7.

Theorem 3.3.2. Consider the synchronous Jacobi algorithm implemented on a measurement graph $\mathcal{G}$ and its associated time-invariant communication graph $\mathcal{G}_c$ such that $\mathcal{G}$ and $\mathcal{G}_c$ satisfy Assumption 3.1.1. Furthermore, assume that Diagonal Assumption 3.3.1 is satisfied. The Jacobi algorithm converges to the unique solution of $\mathcal{L}_c \hat{\mathbf{x}}^{\infty} = \mathbf{b}_c$ if and only if, for every $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is a directed path in $\mathcal{G}_c$ from at least one of the reference nodes to $u$. □

Proof of Theorem 3.3.2. We only consider the case $k = 1$ in the proof. The case $k > 1$ will follow from Diagonal Assumption 3.3.1, as explained earlier.

Define the error at the $i$th iteration as the difference between the current estimate and the limiting estimate:

$$\mathbf{e}^{(i)} := \hat{\mathbf{x}}^{(i)} - \hat{\mathbf{x}}^{\infty}.$$

We rewrite (3.17) as

$$M \hat{\mathbf{x}}^{\infty} = N \hat{\mathbf{x}}^{\infty} + \mathbf{b}_c.$$

It follows from (3.15) that the error evolves according to:

$$\mathbf{e}^{(i+1)} = M^{-1} N \mathbf{e}^{(i)}. \tag{3.21}$$

Clearly, the Jacobi algorithm converges (i.e., the error $\to 0$ as $i \to \infty$) if and only if $\rho(M^{-1} N) < 1$. It follows from Lemma 3.3.2 that this condition is satisfied if and only if, for every $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is a directed path in $\mathcal{G}_c$ from at least one reference node to $u$, which proves the theorem. ■

The next theorem states how the algorithm behaves in the asynchronous mode, i.e., when there are temporary node and communication edge failures, and nodes update their estimates in an asynchronous manner as described in Section 3.3.0.1. Apart from deterministic failures, we consider the following model of random node and communication edge failures. At every time instant $t \in \mathbb{N}$, every communication edge can fail independently of all other edges with probability $p$, and every node can fail independently of all other nodes with probability $q$, where $p < 1$, $q < 1$. This model of failure is referred to as i.i.d. failure. The proof of the result is provided in Section 3.7.

Theorem 3.3.3. Consider the asynchronous Jacobi algorithm implemented on the measurement graph $\mathcal{G}$ and its associated time-varying communication graph $\mathcal{G}_c(t)$, such that $\mathcal{G}$ and $\mathcal{G}_c(t)$ satisfy Assumption 3.1.1 at every $t \in \mathbb{N}$. Let $\mathcal{G}_c^{\mathrm{init}}$ denote the initial communication graph that describes the neighbor relations used by the nodes to implement the algorithm, and let $\mathcal{L}_c$ and $\mathbf{b}_c$ be as defined in (3.12) and (3.16) for the pair $(\mathcal{G}, \mathcal{G}_c^{\mathrm{init}})$. Furthermore, assume that Diagonal Assumption 3.3.1 holds.

Then, the Jacobi algorithm converges to the unique solution of $\mathcal{L}_c \mathbf{x} = \mathbf{b}_c$ if and only if

1.  for every node $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is a directed path in $\mathcal{G}_c^{\mathrm{init}}$ from at least one reference node to that node, and

2.  no communication edge in $\mathcal{G}_c^{\mathrm{init}}$ fails permanently, and no communication edge that is not in $\mathcal{G}_c^{\mathrm{init}}$ remains active infinitely often, i.e.,

$$\bigcap_{\tau=1}^{\infty} \bigcup_{t=\tau}^{\infty} \mathcal{E}_c(t) = \mathcal{E}_c^{\mathrm{init}}. \tag{3.22}$$

When nodes and communication edges fail according to the i.i.d. failure model, if condition 1 above is satisfied, then the Jacobi algorithm converges to the unique solution of $\mathcal{L}_c \mathbf{x} = \mathbf{b}_c$ almost surely. □
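The i.i.d. link-failure case of Theorem 3.3.3 can be explored with a small simulation. The sketch below is only one plausible reading of the asynchronous scheme, since Section 3.3.0.1 is not reproduced here: each node keeps a buffer with the last estimate it heard from each neighbor, every directed link delivers a message with probability $1 - p$ at each step, and no node fails ($q = 0$). The graph, weights, noise, failure probability and random seed are illustrative assumptions, and the initial communication graph is taken to be symmetric so that the limit is the BLU estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]           # node 0 is the reference
w = np.array([1.0, 0.5, 2.0, 1.0])                 # inverse variances (scalar case)
x_true = np.array([0.0, 1.0, -0.5, 2.0])
z = np.array([x_true[u] - x_true[v] for u, v in edges]) + np.array([0.1, -0.2, 0.05, 0.15])

# nbrs[u] lists (v, offset, weight), meaning "x_u is measured as x_v + offset".
nbrs = {u: [] for u in range(4)}
for (u, v), ze, we in zip(edges, z, w):
    nbrs[u].append((v, +ze, we))
    nbrs[v].append((u, -ze, we))

def run_async(p_fail, n_steps):
    x = np.zeros(4)
    x[0] = x_true[0]                               # the reference knows its value
    # buf[(v, u)]: the last estimate of x_v that node u has received.
    buf = {(v, u): x[v] for u in nbrs for v, _, _ in nbrs[u]}
    for _ in range(n_steps):
        for (v, u) in buf:
            if rng.random() >= p_fail:             # link v -> u is active at this step
                buf[(v, u)] = x[v]
        new = x.copy()
        for u in (1, 2, 3):                        # Jacobi update from buffered values
            num = sum(we * (buf[(v, u)] + off) for v, off, we in nbrs[u])
            den = sum(we for _, _, we in nbrs[u])
            new[u] = num / den
        x = new
    return x

# Centralized BLU estimate for comparison: L x = Ab W z (the reference value is 0 here).
A = np.zeros((4, 4))
for j, (u, v) in enumerate(edges):
    A[u, j], A[v, j] = 1.0, -1.0
Ab = A[1:]
x_star = np.linalg.solve(Ab @ np.diag(w) @ Ab.T, Ab @ (w * z))

x_async = run_async(p_fail=0.2, n_steps=2000)
```

With these settings the asynchronous iterates `x_async[1:]` end up at the centralized solution `x_star` to within numerical precision; higher values of `p_fail` mainly slow the convergence.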

Cautionary remark: All of the convergence results of the Jacobi algorithm in this dissertation, either with asynchronous iteration or with asymmetric communication, have been established for the special cases when Diagonal Assumption 3.3.1 is satisfied. A general convergence result is still an open problem. However, in the simulations described in Section 3.3.3, this assumption is violated but the algorithm is seen to converge. Such numerical evidence suggests that Diagonal Assumption 3.3.1 is perhaps required due to our proof technique, but can probably be relaxed. □

The  following  corollary  about  the  correctness  of  the  Jacobi  algorithm  follows 
immediately. 

Corollary 3.3.1. When Assumption 3.1.1 and the Diagonal Assumption 3.3.1 hold, the Jacobi algorithm implemented on a measurement graph $\mathcal{G}$ and its associated time-varying communication graph $\mathcal{G}_c(t)$ is correct if and only if the following conditions hold:

1.  The initial communication graph $\mathcal{G}_c^{\mathrm{init}}$ is symmetric.

2.  No communication edge fails permanently and no communication edge that is not in $\mathcal{G}_c^{\mathrm{init}}$ remains active infinitely often.

When the nodes and communication edges fail according to the i.i.d. failure model, and the first condition above is satisfied, then the Jacobi algorithm converges to the optimal estimates a.s. □

Note that the condition of there being directed paths from reference nodes to the other nodes in Theorem 3.3.3 is automatically satisfied by $\mathcal{G}_c^{\mathrm{init}}$ being symmetric (see Assumption 3.1.1 to see why).

3.3.1.2  Convergence rate of the Jacobi algorithm

For establishing the convergence rate of the Jacobi algorithm, we restrict our attention to the special case of symmetric communication without any communication faults, with the node variables being scalars (i.e., $k = 1$). As explained in Section 3.3.1.1, when Diagonal Assumption 3.3.1 is satisfied, the general case of vector-valued variables can be analyzed in terms of the scalar case. It follows from (3.18) that the convergence rate of the Jacobi algorithm will depend on the spectrum of the Jacobi iteration matrix $J$.

We will need the following notation in presenting the convergence rate result. Let $\lambda_{\min}(L)$ denote the minimum eigenvalue of the matrix $L$, and let $d_{\max}(P)$, $d_{\min}(P)$ denote the maximum and minimum weighted degrees of the graph $\mathcal{G}$, i.e., $d_{\max}(P) := \max_j M_{jj}$ and $d_{\min}(P) := \min_j M_{jj}$. The dependence of $d_{\max}$, $d_{\min}$ on $P$ is used to emphasize that the edge weights are the inverse variances, which are specified by a function $P : \mathcal{E} \to (0, \infty)$. The proof of the following theorem is provided in Section 3.7.

Theorem 3.3.4. Consider a pair of measurement and time-invariant communication graphs $(\mathcal{G}, \mathcal{G}_c)$ that satisfies Assumption 3.1.1. Assume that $k = 1$, $\mathcal{G}_c$ is symmetric, and the Jacobi algorithm is implemented in a synchronous manner. For every $0 < \epsilon < 1$, the number of iterations $n_{\mathrm{iter}}(\epsilon)$ required so that the error ratio is less than $\epsilon$, $\forall i > n_{\mathrm{iter}}(\epsilon)$, satisfies


[■WP),!l0g,ll  <  □ 

The advantage of the result above is that for a large class of graphs that are relevant to ad-hoc sensor networks, asymptotic bounds on $\lambda_{\min}(L)$ can be obtained even without complete knowledge of the graph. We will obtain in Chapter 6 one such bound in terms of "effective resistances" in the graph. In addition, the ratios $\lambda_{\min}(L)/d_{\max}$, $\lambda_{\min}(L)/d_{\min}$ are closely related to the well-known algebraic connectivity of the unweighted graph $\mathcal{G}$, for which an extensive literature exists [81, 82].
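To see how the ratios $\lambda_{\min}(L)/d_{\max}$ and $\lambda_{\min}(L)/d_{\min}$ relate to the spectrum of the Jacobi matrix, the following sketch checks a standard Rayleigh-quotient relation on a path graph with the reference at one end: the gap $1 - \lambda_{\max}(J)$ lies between the two ratios. This is a sanity check under assumed unit weights and an assumed graph, not the bound stated in Theorem 3.3.4; the iteration count at the end uses the asymptotic rate $\rho(J)$ and is likewise only indicative.

```python
import numpy as np

n_b = 20                                           # number of non-reference nodes
# Path graph: reference - 1 - 2 - ... - n_b, unit edge weights (assumption).
L = 2 * np.eye(n_b) - np.eye(n_b, k=1) - np.eye(n_b, k=-1)
L[-1, -1] = 1.0                                    # the last node has a single neighbor

M = np.diag(np.diag(L))                            # weighted degrees
N = M - L                                          # adjacency part
lam_min = np.linalg.eigvalsh(L)[0]
d_max, d_min = np.diag(M).max(), np.diag(M).min()

# J = M^{-1} N is similar to the symmetric matrix M^{-1/2} N M^{-1/2}.
Mh = np.diag(1.0 / np.sqrt(np.diag(M)))
eigs_J = np.linalg.eigvalsh(Mh @ N @ Mh)
gap = 1.0 - eigs_J[-1]                             # 1 - lambda_max(J)

rho = max(abs(eigs_J))                             # spectral radius of J
eps = 0.05
n_iter_estimate = int(np.ceil(np.log(eps) / np.log(rho)))
```

Because the gap shrinks with $\lambda_{\min}(L)$, graphs with a small $\lambda_{\min}(L)$ relative to their degrees need many iterations, which is the point made in the discussion above.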

3.3.2  Reducing  error  faster  -  flagged  initialization 

The preceding discussion shows that in measurement graphs with low algebraic connectivity, the Jacobi algorithm will take a large number of iterations before the error ratio becomes sufficiently small. Since large ad-hoc and sensor networks are expected to have low algebraic connectivity, one can expect that the number of iterations that the Jacobi algorithm takes before the error with respect to the optimal estimate, $\|\mathbf{e}^{*(i)}\|$, is lower than a pre-specified value will be, in general, quite large.

There are two ways to reduce the error $\|\mathbf{e}^{*(i)}\|$: employ an algorithm with a faster convergence rate than Jacobi, or initialize the iterations with more accurate initial estimates. Devising a faster algorithm is postponed till Section 3.4. In this section we show how to reduce the error of the Jacobi algorithm with respect to the optimal estimate by choosing the initial estimates carefully. This scheme, called flagged initialization, does not require extra communication or expensive computation, and is also applicable to other algorithms. Indeed, the flagged initialization will be used with the OSE algorithm to reduce its error.


After the deployment of the network, the reference nodes initialize their estimates to their known values, but all other nodes initialize their estimates to $\infty$, which serves as a flag to declare that these nodes do not have a good estimate of their variables. Subsequently, in its estimate updates, each node includes in its 1-hop subgraph only those nodes that have finite estimates. If none of its neighbors has a finite estimate, then the node keeps its estimate at $\infty$. In the beginning, only the references have a finite estimate. In the first iteration, the 1-hop neighbors of the references can compute finite estimates, whereas in the second iteration, the 2-hop neighbors of the references can also obtain finite estimates, and so forth until all nodes have finite estimates. Flagged initialization affects only the initial stage of the algorithm, and thus does not affect its correctness or the rate at which the error ratio is reduced with iteration number $i$.
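The flagged initialization scheme can be sketched in a few lines for the scalar, synchronous, symmetric-communication case. The code below is an illustration under assumptions: the function name, the data layout, the path graph and the noise-free measurements are hypothetical, and a measurement on edge $(u, v)$ is taken to be $x_u - x_v$. Nodes start at `math.inf` and only average over neighbors whose estimates are finite.

```python
import math

def jacobi_flagged(n, edges, z, w, ref, n_iter):
    """Synchronous scalar Jacobi iteration with flagged (infinite) initial estimates.

    edges[j] = (u, v) carries the measurement z[j] of x_u - x_v with weight w[j];
    ref maps reference nodes to their known values. Returns the list of iterates.
    """
    x = [math.inf] * n                             # inf flags "no estimate yet"
    for r, val in ref.items():
        x[r] = val
    # nbrs[u] lists (v, offset, weight), meaning "x_u is measured as x_v + offset".
    nbrs = {u: [] for u in range(n)}
    for (u, v), ze, we in zip(edges, z, w):
        nbrs[u].append((v, +ze, we))
        nbrs[v].append((u, -ze, we))
    history = [list(x)]
    for _ in range(n_iter):
        new = list(x)
        for u in range(n):
            if u in ref:
                continue
            # Use only neighbors that already hold a finite estimate.
            usable = [t for t in nbrs[u] if math.isfinite(x[t[0]])]
            if usable:
                num = sum(we * (x[v] + off) for v, off, we in usable)
                new[u] = num / sum(t[2] for t in usable)
        x = new
        history.append(list(x))
    return history

# Path graph 0 - 1 - 2 - 3 with node 0 as reference and noise-free measurements.
x_true = [0.0, 1.0, 3.0, 6.0]
edges = [(0, 1), (1, 2), (2, 3)]
z = [x_true[u] - x_true[v] for u, v in edges]
history = jacobi_flagged(4, edges, z, [1.0, 1.0, 1.0], {0: 0.0}, n_iter=3)
```

In this example finite estimates spread one hop per iteration: after the first iteration only node 1 has a finite value, and after three iterations all nodes hold their exact values, whereas an all-zero initialization would still be far from them at that point.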

3.3.3  Simulations  of  the  Jacobi  algorithm 

We  present  a  few  numerical  simulations  to  study  the  behavior  of  the  Jacobi 
algorithm.  First  we  simulate  the  algorithm  with  symmetric  communication  and 
synchronous  operation. 

In these simulations the node variables represent the physical position of sensors in the plane. All simulations refer to a network with 200 nodes that are randomly placed in the unit square (see Figure 3.8). Node 1, placed at the origin, is chosen as the single reference node. Pairs of nodes separated by a distance smaller than $r_{\max} := 0.11$ are allowed to have noisy measurements of each other's relative range and bearing (see Figure 2.1). The range measurements are corrupted with zero-mean additive Gaussian noise with standard deviation $\sigma_r = 0.15\, r_{\max}$, and the angle measurements are corrupted with zero-mean additive Gaussian noise with standard deviation $\sigma_\theta = 10$ deg. Assuming that the range and bearing measurement errors are independent and have variances independent of distance, consider a noisy measurement $(r, \theta)$ of true range and angle $(r_0, \theta_0)$. Then it can be shown that the covariance matrix of the measurement $\zeta_{u,v} = [r\cos\theta,\ r\sin\theta]^T$ is given approximately by

$$P_{u,v} = \begin{bmatrix} y_0^2 \sigma_\theta^2 + \sigma_r^2 \cos^2\theta_0 & -x_0 y_0 \sigma_\theta^2 + \frac{\sigma_r^2}{2} \sin(2\theta_0) \\ -x_0 y_0 \sigma_\theta^2 + \frac{\sigma_r^2}{2} \sin(2\theta_0) & x_0^2 \sigma_\theta^2 + \sigma_r^2 \sin^2\theta_0 \end{bmatrix}, \tag{3.23}$$

where $x_0 = r_0 \cos\theta_0$ and $y_0 = r_0 \sin\theta_0$. Assuming that the scalars $\sigma_r$, $\sigma_\theta$ are provided a priori to the nodes, a node can estimate this covariance by using the measured $r$ and $\theta$ in place of their unknown true values. Since the covariances are not diagonal and since distinct measurements have distinct covariances, this example does not satisfy the assumptions for which the OSE algorithm is guaranteed to converge. The locations estimated by the centralized optimal estimator are shown in Figure 3.8, together with the true locations.
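The approximate covariance (3.23) is what a first-order (Jacobian) propagation of independent range and bearing errors through $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$ gives. The sketch below evaluates (3.23) entry by entry and compares it with the linearized propagation; the function names and the sample range and bearing are hypothetical, angles are in radians, and the noise levels are the ones quoted above.

```python
import numpy as np

def range_bearing_cov(r, theta, sigma_r, sigma_theta):
    """Covariance of [r cos(theta), r sin(theta)] written out as in (3.23)."""
    x, y = r * np.cos(theta), r * np.sin(theta)
    pxx = y**2 * sigma_theta**2 + sigma_r**2 * np.cos(theta) ** 2
    pyy = x**2 * sigma_theta**2 + sigma_r**2 * np.sin(theta) ** 2
    pxy = -x * y * sigma_theta**2 + 0.5 * sigma_r**2 * np.sin(2 * theta)
    return np.array([[pxx, pxy], [pxy, pyy]])

def linearized_cov(r, theta, sigma_r, sigma_theta):
    """First-order propagation: Jac * diag(sigma_r^2, sigma_theta^2) * Jac^T."""
    jac = np.array([
        [np.cos(theta), -r * np.sin(theta)],
        [np.sin(theta),  r * np.cos(theta)],
    ])
    return jac @ np.diag([sigma_r**2, sigma_theta**2]) @ jac.T

r_max = 0.11
sigma_r, sigma_theta = 0.15 * r_max, np.deg2rad(10.0)
# Hypothetical measurement: range 0.08 and bearing 35 degrees.
P = range_bearing_cov(0.08, np.deg2rad(35.0), sigma_r, sigma_theta)
P_lin = linearized_cov(0.08, np.deg2rad(35.0), sigma_r, sigma_theta)
```

The two matrices coincide, and the off-diagonal entry of `P` is non-zero for this bearing, which illustrates why the covariances in these simulations are not diagonal.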


In reporting simulation results, we plot the normalized error vs. iteration number, where the normalized error is defined as

$$\frac{\|\hat{\mathbf{x}}^{(i)} - \hat{\mathbf{x}}^*\|}{\|\hat{\mathbf{x}}^*\|}. \tag{3.24}$$

Recall that $\hat{\mathbf{x}}^{(i)}$ is the vector of estimates at the $i$th iteration and $\hat{\mathbf{x}}^*$ is the optimal estimate.


Figure 3.9(a) compares the normalized error as a function of iteration number for the Jacobi algorithm, with and without flagged initialization. The straight lines in the log-scaled graph indicate the exponential convergence of the algorithm. The figure shows the dramatic improvement achieved with the flagged initialization scheme. With flagged initialization, the Jacobi algorithm can estimate the node positions within 5% of the optimal estimate after only 9 iterations.

Figure 3.8. A measurement graph created by an ad-hoc sensor network with 200 nodes distributed randomly in a unit square area. The edges of the measurement graph are shown as line segments connecting the node positions, which are shown as black dots. Two nodes with an edge between them are provided with a measurement of their relative positions in the plane. The red squares are the positions estimated by the (centralized) optimal estimator. A single reference node is located at the origin.

Figure 3.9(b) shows the performance of the Jacobi algorithm with flagged initialization under i.i.d. communication link failure. Every symmetric communication link is allowed to fail (independently of all other links) with probability $p_f$ at every iteration, and no node is allowed to fail. Therefore, in the terminology of Theorem 3.3.3, $p = p_f$, $q = 0$. Note that although the communication graph $\mathcal{G}_c^{\mathrm{init}}$ is symmetric, the communication failures need not be. A communication edge $(u, v)$ can fail independently of the communication edge $(v, u)$. Three values of the failure probability are tested. Not surprisingly, higher failure rates result in slower convergence. Note that Theorem 3.3.3 guarantees the convergence of the error to 0 with link and node failure only when Diagonal Assumption 3.3.1 is satisfied. In these simulations the assumption was not satisfied (see the covariances in (3.23)), yet the algorithm is seen to converge to the optimal estimates.


Figure 3.9. Simulation results (normalized error $\|\hat{\mathbf{x}}^{(i)} - \hat{\mathbf{x}}^*\| / \|\hat{\mathbf{x}}^*\|$ vs. iteration number) of the Jacobi algorithm with symmetric communication. (a) With and without flagged initialization. "0 init." means all the node estimates were initialized to 0. The communication graph is symmetric and time-invariant (no faults). (b) Effect of communication-link failures. All the simulations in case (b) are carried out with flagged initialization. Note that although the communication graph $\mathcal{G}_c^{\mathrm{init}}$ is symmetric, the communication failures need not be. A communication edge $(u, v)$ can fail independently of the communication edge $(v, u)$.

Now we present numerical evidence that the Jacobi algorithm indeed converges to a non-optimal estimate when communication is asymmetric. The Jacobi algorithm is simulated for the measurement and communication graph pair shown in Figure 3.1. Simulation results are shown in Figure 3.10. At every iteration of the simulation, every communication edge was allowed to fail with a probability of 0.2, independent of all other edges, i.e., $p = 0.2$ and $q = 0$. The figure validates the predictions of Theorem 3.3.3 and Lemma 3.3.1: the estimate converges to the predicted value $\hat{\mathbf{x}}^{\infty}$ but not to the optimal estimate $\hat{\mathbf{x}}^*$.


Figure 3.10. Simulation results on the convergence of the Jacobi algorithm with asymmetric communication. The simulation was conducted for the measurement graph and communication graph shown in Figure 3.1. Each communication edge was allowed to fail at each iteration, with a probability of 0.2, independent of all other edges. The algorithm converges to an unbiased estimate $\hat{\mathbf{x}}^{\infty}$ whose variance is larger than that of the BLU estimate.


3.4  The overlapping subgraph estimator (OSE) algorithm

In this section we describe a distributed algorithm to compute the optimal estimates that has a faster convergence rate compared to the Jacobi algorithm. In spite of the advantages of the Jacobi algorithm discussed above, such as scalability, convergence, correctness under mild assumptions, and robustness to temporary failures, it has a significant weakness, namely, its slow convergence rate (see the discussion following Theorem 3.3.4).

It may be possible to improve the convergence rate by using other iterative techniques such as the Gauss-Seidel, SOR or conjugate gradient [75] methods, or even by preconditioning, but any such improvement will come at the cost of increased communication. In ad-hoc wireless networks, the primary source of energy consumption is communication [10], while much less energy is consumed for computation [11]. Therefore the challenge is to devise an algorithm that achieves faster convergence compared to the Jacobi algorithm with no, or minimal, increase in communication. The overlapping subgraph estimator (OSE) algorithm described in this section achieves these objectives. It also retains the scalability and robustness properties of the Jacobi algorithm.

We will consider only the symmetric communication case in describing the OSE algorithm. All the analysis and simulation of the OSE algorithm will be done under the assumption that inter-node communication is symmetric. A general analysis of the algorithm in the asymmetric case is a subject of future research.

The OSE algorithm can be thought of as an extension of the Jacobi algorithm,
in which individual nodes utilize larger subgraphs to improve their estimates. To
understand how, suppose that each node broadcasts to its 1-hop neighbors not
only its current estimate, but also all of the latest estimates that it received from
its 1-hop neighbors. Note that since we have assumed symmetric communication
between nodes, the 1-hop communication neighbors and 1-hop measurement
neighbors are identical, and we refer to them simply as the 1-hop neighbors. In the absence
of drops, at the $i$th iteration step each node $u$ has the estimates $\hat{x}_v^{(i-1)}$ for its 1-hop
neighbors $v \in \mathcal{N}_u(1)$ as well as the (older) estimates $\hat{x}_v^{(i-2)}$ for its 2-hop neighbors
$v \in \mathcal{V}_u(2) \setminus \mathcal{N}_u(1)$.

The reason that we do not attempt a complete analysis of the OSE algorithm
in the presence of asymmetric communication, apart from the mathematical difficulty
in carrying out such an analysis, is that, as will be shown in Section 3.5,
it is impossible to ameliorate some of the detrimental effects of asymmetric
communication by using the OSE algorithm in place of Jacobi.

Under the information exchange scheme described above, at the $i$th iteration
each node $u$ has estimates of all of the node variables of the nodes in the set
$\mathcal{V}_u(2)$ consisting of all of its 1-hop and 2-hop measurement neighbors. In the
OSE algorithm, each node updates its estimate using the 2-hop subgraph $\mathcal{G}_u(2) =
(\mathcal{V}_u(2), \mathcal{E}_u(2))$ of $\mathcal{G}$ centered at $u$, with edge set $\mathcal{E}_u(2)$ consisting of all of the edges
of the measurement graph $\mathcal{G}$ that connect elements of $\mathcal{V}_u(2)$. For this estimation
problem, node $u$ takes as references the node variables of its 2-hop neighbors. The
gain in convergence speed with respect to the Jacobi algorithm comes from the
fact that the 2-hop subgraph $\mathcal{G}_u(2)$ contains more edges than the 1-hop subgraph
$\mathcal{G}_u(1)$. The OSE algorithm can be summarized as follows:

1. Each node $u \in \mathcal{V}$ picks an arbitrary initial estimate $\hat{x}_v^{(-1)}$ of the node
variable $x_v$ of each of its 2-hop neighbors $v \in \mathcal{V}_u(2) \setminus \mathcal{V}_u(1)$. These estimates
need not be consistent across different nodes.

2. At the $i$th iteration, each node $u \in \mathcal{V}$ assumes that the estimates $\hat{x}_v^{(i-2)}$
of the node variables $x_v$ of its 2-hop neighbors that it received through its
1-hop neighbors are correct and solves the corresponding optimal estimation
problem associated with the 2-hop subgraph $\mathcal{G}_u(2)$. In particular, each node
$u$ solves the linear equations $L_{u,2}\,\mathbf{y}_u = b_u$, where $\mathbf{y}_u$ is a vector of node variables
that correspond to the nodes in its 1-hop subgraph $\mathcal{G}_u(1)$, and $L_{u,2}$, $b_u$
are defined for the subgraph $\mathcal{G}_u(2)$ as $L$, $b$ are for $\mathcal{G}$ in (2.6). After this computation,
node $u$ updates its estimate as $\hat{x}_u^{(i+1)} \leftarrow \lambda y_u + (1 - \lambda)\hat{x}_u^{(i)}$, where
$0 < \lambda \le 1$ is a pre-specified design parameter and $y_u$ is the component of $\mathbf{y}_u$
that corresponds to $x_u$. The new estimate $\hat{x}_u^{(i+1)}$, as well as the estimates
previously received from its 1-hop neighbors $v \in \mathcal{V}_u(1)$, are then broadcast
to all of its 1-hop neighbors.

3.  At  the  end  of  the  ith  iteration,  each  node  u  then  listens  for  the  broadcasts 
from  its  1-hop  neighbors  and  uses  them  to  update  its  estimates  for  the  node 
variables  of  all  of  its  2-hop  neighbors.  Once  all  updates  are  received  a  new 
iteration  can  start. 

As  in  the  case  of  the  Jacobi  algorithm,  the  termination  criteria  vary  depending  on 
the  application,  and  nodes  use  measurements  and  covariances  obtained  initially  for 
all  future  time.  Figure  3.11  shows  a  2-hop  subgraph  used  by  the  OSE  algorithm. 
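The node-level computation in step 2 can be sketched as follows for scalar node variables ($k = 1$). This is a minimal illustration under stated assumptions, not the implementation used for the simulations: the function names `solve_linear` and `ose_update`, the edge tuple layout `(a, b, zeta, var)`, and the sign convention that `zeta` measures $x_a - x_b$ are all hypothetical choices made for the example. Nodes outside `unknowns` (the 2-hop neighbors and any reference nodes) are held fixed at their current estimates.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def ose_update(u, unknowns, edges, est, lam=0.9):
    """One relaxed OSE step at node u (scalar node variables).

    unknowns: nodes that u solves for (u and its 1-hop neighbors).
    edges:    (a, b, zeta, var) tuples of the 2-hop subgraph, where zeta is
              assumed to measure x_a - x_b with error variance var.
    est:      current estimates; nodes not in `unknowns` are treated as known.
    """
    idx = {v: i for i, v in enumerate(unknowns)}
    n = len(unknowns)
    L = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for a, b, zeta, var in edges:
        w = 1.0 / var  # weight of this measurement in the least-squares problem
        for node, other, sign in ((a, b, 1.0), (b, a, -1.0)):
            if node not in idx:
                continue
            i = idx[node]
            L[i][i] += w
            rhs[i] += sign * w * zeta
            if other in idx:
                L[i][idx[other]] -= w
            else:
                rhs[i] += w * est[other]  # fixed neighbor moves to the right side
    y = solve_linear(L, rhs)
    # Relaxation: blend the local solution with the previous estimate.
    return lam * y[idx[u]] + (1.0 - lam) * est[u]
```

With `lam = 1` the node simply adopts the solution of its local subgraph problem; smaller values damp the update.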


Figure 3.11. (a) A measurement graph $\mathcal{G}$ with node 1 as reference, and (b) a 2-hop
subgraph $\mathcal{G}_4(2)$ centered at node 4. While running the OSE algorithm, node 4
treats nodes 1, 5, and 2 as reference nodes in the subgraph $\mathcal{G}_4(2)$ and solves for
the unknowns $x_3$, $x_4$, and $x_6$.

The previous description assumes that communication is synchronous and that
each  node  receives  broadcasts  from  all  of  its  neighbors.  However,  as  in  the  Jacobi 
algorithm,  the  OSE  algorithm  can  be  implemented  in  an  asynchronous  manner 
to  make  the  algorithm  robust  to  imperfect  synchronization  and  link  failures.  A 
timeout  mechanism  can  be  used  for  this  purpose,  in  which  each  node  resets  a  timer 
as  it  broadcasts  its  most  recent  estimates.  When  this  timer  reaches  a  pre-specified 
timeout  value,  the  node  initiates  a  new  iteration,  regardless  of  whether  or  not  it 
received  messages  from  all  of  its  1-hop  neighbors.  If  a  message  is  not  received 
from  one  of  its  neighbors,  the  node  uses  the  data  most  recently  received  from  that 
neighbor  for  the  next  iteration. 

Remark 3.4.1 ($h$-hop OSE). One can also design an $h$-hop OSE algorithm by letting
every node utilize an $h$-hop subgraph centered at itself, where $h$ is a (small) integer.
The resulting algorithm is a straightforward extension of the 2-hop OSE just
described, except that at every iteration, individual nodes have to transmit to their
neighbors larger amounts of data than in 2-hop OSE, potentially requiring multiple
packet transmissions at each iteration. In practice, this added communication cost
limits the allowable value of $h$. □
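The $h$-hop subgraph used by a node can be extracted with a breadth-first search truncated at depth $h$. The following is a minimal sketch, assuming the measurement graph is given as a list of node pairs; the function names `h_hop_nodes` and `h_hop_subgraph` are hypothetical. As in the 2-hop case described above, the subgraph keeps every measurement edge whose two endpoints both lie within $h$ hops of the center.

```python
from collections import deque


def h_hop_nodes(adj, u, h):
    """Return {node: hop distance from u} for all nodes within h hops of u."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        a = queue.popleft()
        if dist[a] == h:
            continue  # do not expand beyond h hops
        for b in adj.get(a, ()):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist


def h_hop_subgraph(edges, u, h):
    """Nodes (with hop distances) and edges of the h-hop subgraph centered at u."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = h_hop_nodes(adj, u, h)
    sub_edges = [(a, b) for a, b in edges if a in dist and b in dist]
    return dist, sub_edges
```

The hop distances returned alongside the edges make it easy to separate the nodes a center solves for (distance less than $h$) from those it treats as references (distance exactly $h$).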

The  next  result  establishes  the  correctness  of  the  OSE  algorithm. 

Theorem 3.4.1. Consider a pair of measurement and initial communication graphs
$(\mathcal{G}, \mathcal{G}_{\mathrm{init}})$ satisfying Assumption 3.1.1. When the communication graph $\mathcal{G}^c(t)$ is
symmetric at every $t \in \mathbb{N}$, Diagonal Assumption 3.3.1 is satisfied, and no
node  or  communication  edge  fails  permanently,  i.e.,  a=i  u=^cw  =  gu,  then 
the  OSE  algorithm  converges  to  the  optimal  estimate.  □ 

Proof of Theorem 3.4.1. In the special case $\lambda = 1$, the OSE algorithm becomes
the same as the Asynchronous Weighted Additive Schwarz (AWAS) method [83].
In that case, Theorem 3.1 in [83] states that if $L^{-1} \ge 0$ and some weak regularity
condition holds, then the AWAS method converges for every initial condition.
Under the assumptions, it follows from Lemma 3.3.2 that $L$ is a non-singular M-matrix,
and therefore $L^{-1} \ge 0$ [78]. The splitting $L = M - N$ is called weak
regular if $M^{-1} \ge 0$ and $M^{-1}N \ge 0$, which is satisfied in our case. The regularity
condition required in [83] is actually not on the splitting $M - N$ but on a number
of splittings that every node can be thought of as applying in its local processor.
We refrain from repeating the tedious details, but it is straightforward to check
that the OSE algorithm satisfies the weak regularity conditions in Theorem 3.1
of [83]. This guarantees convergence of the AWAS method, and therefore of the
OSE algorithm. When $\lambda < 1$, the proof technique of Theorem 3.1 in [83] can
be adapted to prove again that the OSE algorithm converges. Since the proof is
extremely long and tedious, yet only a minor generalization of the results in [83],
it is not provided here. The complete proof is, however, available in [66].

Note that the inverse-positivity of M-matrices, and specifically of $L$, was used
above, but not in proving convergence of the Jacobi algorithm. As in the case of
the Jacobi algorithm, Assumption 3.3.1 can probably be relaxed for the OSE
algorithm's correctness. In the simulations described below, this assumption is
violated but the algorithm is seen to converge.

3.4.0.1 Modified EPA

The Embedded Polygon Algorithm (EPA) proposed in [69] can be used for
iteratively solving (2.6), since it is essentially a block-Jacobi method for solving
a system of linear equations, where the blocks correspond to non-overlapping
polygons. The special case when the polygons are triangles has been extensively
studied in [69]. We will not include here the details of the algorithm, including
triangle formation in the initial phase, the intermediate computation, communication
and update. The interested reader is referred to [69]. It is not difficult
to adapt the algorithm in [69] to the problem considered here. We have implemented
the modified EPA algorithm (with triangles as the embedded polygons)
and compared it with both Jacobi and OSE. Results are presented in Section 3.4.2.

3.4.1  Simulations  of  the  OSE  algorithm 

In this section, we present numerical simulations to illustrate the performance
of the OSE algorithm and compare its convergence rate with that of Jacobi.
All the simulations with the OSE algorithm are done for the measurement
graph shown in Figure 3.8. The construction of the measurement graph,
along with the measurements and their associated error covariances, is described
in Section 3.3.3.

Figure 3.12(a) compares the normalized error as a function of iteration number
for the three algorithms discussed in this chapter - Jacobi, EPA and OSE. Two
versions of OSE were tested, 2-hop and 3-hop. It is clear from this figure that
OSE outperforms both Jacobi and modified EPA. As the figure shows, drastic
improvement was achieved with the flagged initialization scheme. With flagged
initialization, the 2-hop OSE algorithm can estimate the node positions within
3% of the optimal estimate after only 9 iterations. For the flagged OSE, the
normalized error is not defined until iteration number 8, since some nodes had no
estimate of their positions until that time. Figure 3.12(b) shows the performance of
the 2-hop OSE algorithm with flagged initialization under two different link-failure
probabilities. Not surprisingly, higher failure rates result in slower convergence.

Figure 3.12. (a) Performance comparison between the Jacobi algorithm and the
overlapping subgraph estimator (OSE) algorithm without link failures. The normalized
error is defined as $\|\hat{x}^{(i)} - \hat{x}^*\| / \|\hat{x}^*\|$, where $\hat{x}^{(i)}$ is the vector of estimates at the $i$-th
iteration and $\hat{x}^*$ is the optimal estimate. Except for the case with flagged initialization,
all of the simulations are run with all initial estimates of node variables
set to 0. For the flagged OSE, the normalized error can be defined only after iteration
number 8 because until then not all nodes have valid (finite) estimates. (b)
Performance of 2-hop OSE with link failures. All simulations are run with flagged
initialization. Two different failure probabilities are compared with the case of no
failure. With higher probability of failure, performance degrades but the error is
seen to decrease with iteration count even with large failure probabilities.

3.4.2 Energy cost comparison


For ad-hoc wireless network applications, the primary metric for comparison
between the algorithms described above is not the number of iterations required
to drive the error below a certain value, but the average energy consumed in order
to do so. The reason is that in ad-hoc wireless sensor networks, one of the main
challenges is to keep the network functional for an extended period of time in
spite of the limited battery life of the sensors [84]. Reducing energy consumption is
therefore critical.

The OSE algorithm converges faster than both Jacobi and EPA. However,
faster convergence is achieved at the expense of each node sending and processing
more data. One may then ask whether there is a significant advantage to using
the OSE algorithm. It turns out that the energy cost of sending additional data can
be negligible, because energy consumption in wireless communication depends in a
complex way on the radio hardware, the underlying PHY and MAC layer protocols,
the network topology and a host of other factors.

Investigation into the energy consumption of wireless sensor nodes has been rather
limited. Still, we can get an idea of which parameters are important for energy
consumption from the studies reported in [85-87]. It is reported in [87] that
for very short packets (on the order of 100 bits), transceiver startup dominates
the power consumption, so sending a very short message offers no advantage in
terms of energy consumption over sending a somewhat longer message. In fact,
in a recent study of a dense network of IEEE 802.15.4 wireless sensor nodes, it is
reported that the transmitted energy per bit in a packet decreases monotonically up to
the maximum payload [85]. One of the main findings in [86] was that in highly
contentious networks, “transmitting large payloads is more energy efficient”. On
the other hand, receive and idle mode operation of the radio is seen to consume as
much energy as the transmit mode, if not more [88]. Thus, the number of packets
sent and received appears to be a better measure to predict energy consumption
than the number of bits.

For the reasons outlined above, we take the number of packets transmitted
and received by a node as a measure of its energy consumption. Let $N_{tx}^{(i)}(u)$ be
the number of packets a node $u$ transmits to its neighbors during the $i$th iteration.
The energy $E^{(i)}(u)$ expended by $u$ in sending and receiving data during the $i$th
iteration is computed by the following formula:

$$E^{(i)}(u) = N_{tx}^{(i)}(u) + \frac{3}{4} \sum_{v \in \mathcal{N}_u} N_{tx}^{(i)}(v), \tag{3.25}$$

where $\mathcal{N}_u$ is the set of neighbors of $u$. The factor 3/4 is chosen to account for
the ratio between the power consumptions in the receive mode and the transmit
mode. Our choice is based on values reported in [85] and [89]. The average energy
consumption $E(\epsilon)$ is the average, over nodes, of the total energy consumed
by each node until the normalized error falls to $\epsilon$. For simplicity, eq. (3.25)
assumes synchronous updates and perfect communication (no retransmissions).
When packet transmission is unsuccessful, multiple retransmissions may result,
making the resulting energy consumption a complex function of the parameters
involved [85, 86].
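The packet-count energy measure of eq. (3.25) is simple enough to state in a few lines of code. The sketch below is illustrative only: the function names `energy_per_iteration` and `average_energy` and the dictionary-based data layout are assumptions, and energy is expressed in units of one transmitted packet, as in the text.

```python
def energy_per_iteration(u, neighbors, n_tx, rx_factor=0.75):
    """Energy measure (3.25) for node u in one iteration.

    neighbors: dict mapping each node to an iterable of its neighbors.
    n_tx:      dict mapping each node to the number of packets it
               transmits in this iteration.
    rx_factor: receive-to-transmit power ratio (3/4 in the text).
    """
    received = sum(n_tx[v] for v in neighbors[u])
    return n_tx[u] + rx_factor * received


def average_energy(neighbors, n_tx_per_iteration):
    """Average over nodes of the energy accumulated over the given iterations."""
    nodes = list(neighbors)
    total = sum(
        energy_per_iteration(u, neighbors, n_tx)
        for n_tx in n_tx_per_iteration
        for u in nodes
    )
    return total / len(nodes)
```

Passing the per-iteration packet counts up to the iteration at which the normalized error first falls to $\epsilon$ gives the average energy consumption $E(\epsilon)$ under the same no-retransmission assumption as eq. (3.25).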

In one iteration of the Jacobi algorithm, a node needs to broadcast its own
estimate, which consists of $k$ real numbers. Recall that $k$ is the dimension of the
node variables. Assuming a 32-bit encoding, that amounts to $4k$ bytes of data. In
the OSE algorithm, a node with $d$ neighbors has to broadcast data consisting of
$4d$ bytes for its neighbors' IP addresses, $4k(d+1)$ bytes for the previous estimates
of itself and its neighbors, and $3d$ bytes for time stamps of those estimates. This
leads to a total of $(7 + 4k)d + 4k$ bytes of data, and consequently the number of
packets in a message becomes

$$N_{tx}(u) = \left\lceil \frac{(7 + 4k)d + 4k}{\texttt{max\_databytes\_pkt}} \right\rceil, \tag{3.26}$$

where max_databytes_pkt is the maximum number of bytes of data allowed in the
payload per packet. We assume that the maximum data per packet is 118 bytes,
as per IEEE 802.15.4 specifications [90]. For comparison, we note that the number
of bytes in a packet transmitted by MICA motes can vary from 29 bytes to 250
bytes depending on whether B-MAC or S-MAC is used [91]. If the number of
data bytes allowed is quite small, OSE may require multiple packet transmissions
in every iteration, making it more expensive.
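The byte count and eq. (3.26) can be checked with a short calculation. In the sketch below the function names are hypothetical; the byte sizes (4 bytes per address, 4 bytes per real number, 3 bytes per time stamp) and the 118-byte payload limit are the ones stated above.

```python
def ose_payload_bytes(d, k):
    """Bytes broadcast per iteration by an OSE node with d neighbors."""
    addresses = 4 * d            # neighbors' addresses
    estimates = 4 * k * (d + 1)  # own estimate plus one per neighbor
    timestamps = 3 * d           # one time stamp per neighbor estimate
    return addresses + estimates + timestamps


def jacobi_payload_bytes(k):
    """Bytes broadcast per iteration by a Jacobi node (its own estimate only)."""
    return 4 * k


def packets_per_message(payload_bytes, max_databytes_pkt=118):
    """Number of packets needed for the payload, i.e. the ceiling in (3.26)."""
    return -(-payload_bytes // max_databytes_pkt)  # integer ceiling division
```

For example, with $k = 2$ a node with $d = 5$ neighbors sends $15 \cdot 5 + 8 = 83$ bytes, which fits in a single 118-byte payload, whereas a node with $d = 8$ neighbors sends 128 bytes and needs two packets.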


The average energy consumption $E$ of the three algorithms - Jacobi, modified
EPA and 2-hop OSE - is compared in Figure 3.13. Flagged initialization was
used in all three algorithms. To compute the energy consumption for the 2-hop
OSE, we apply (3.26) with $k = 2$ and max_databytes_pkt = 118 to get
$N_{tx}(u) = \lceil (15 d_u + 8)/118 \rceil$. The average node degree being 5, the number of
packets broadcast per iteration in the case of the OSE algorithm was 1 for almost
all the nodes. For Jacobi, the number of packets broadcast at every iteration
was 1 for every node. For the modified EPA algorithm, the number of packets in
every transmission was 1 but the total number of transmissions in every iteration
was larger (than for Jacobi and OSE) due to the data exchange required in both
the EPA update and EPA solve steps (see [69] for details). The normalized error
against the average (among all the nodes) total energy consumed $E$ is computed
and plotted in Figure 3.13. Comparing the plots, one sees that for a normalized
error of 1%, OSE consumes about 70% of the energy consumed by modified
EPA and 60% of that by Jacobi. As lower errors are demanded, the difference
becomes more drastic: to achieve a normalized error of 0.8%, OSE needs only 60%
of the energy consumed by EPA and about half of that by Jacobi.

Figure 3.13. The normalized error $\|\hat{x}^{(i)} - \hat{x}^*\| / \|\hat{x}^*\|$ vs. average energy consumption of
2-hop OSE, modified EPA and Jacobi with broadcast communication. Flagged
initialization was used in all three algorithms.

Note  that  the  energy  consumption  benefits  of  OSE  become  more  pronounced 
as  higher  accuracy  is  demanded,  but  less  so  for  low  accuracy.  This  feature  is 
due  to  the  flagged  initialization,  which  accounts  for  almost  all  the  error  reduction 
in  the  first  few  iterations.  Note  that  the  energy  savings  in  OSE  will  occur  only 
if  the  extra  data  can  be  packed  into  a  small  number  of  packets.  In  such  cases, 
the  OSE  algorithm  is  advantageous  compared  to  the  Jacobi  algorithm  because 
OSE  requires  a  smaller  number  of  iterations  -  and  therefore  a  smaller  number  of 
messages  -  compared  to  Jacobi  to  achieve  a  desired  error  tolerance,  resulting  in 
lower energy consumption and increased network life.


3.5 Effect of asymmetric communication


In Section 3.3 we saw that if the communication graph is asymmetric, the
Jacobi algorithm does not converge to the optimal estimate. This raises two
questions:

1. Is it possible to design a distributed algorithm that converges to the optimal
estimate even in the presence of asymmetric communication?

2.  How  sensitive  is  the  Jacobi  algorithm  (or  some  other  distributed  algorithm) 
to  the  level  of  asymmetry?  In  other  words,  does  increasing  the  level  of 
asymmetry  make  the  variance  of  the  estimates  (that  the  algorithm  converges 
to)  larger? 

3.5.1  An  impossibility  result 

The answer to the first question is no. This can be seen from the example shown
in Figure 3.14, where the reference variable is $x_1 = 0$ and the measurement error
variances are equal. Consider the case when the communication graph (shown in
Figure 3.14) is time-invariant. It satisfies the conditions of Theorem 3.3.2, and
therefore the Jacobi algorithm converges. However, due to the asymmetry in the
communication graph, the limiting estimate will be different from the optimal
estimate. It is clear from the figure that, due to the information flow structure
imposed by the communication graph, node 2 will only have information about the
reference variable, which is 0, and the measurement $\zeta_{12}$. The optimal estimate of
$x_2$ is, however, a combination of all three measurements: $\hat{x}_2 = -\frac{2}{3}\zeta_{12} - \frac{1}{3}(\zeta_{13} - \zeta_{32})$.
Clearly no distributed algorithm can converge to the optimal estimate, since
information on $\zeta_{13}$ will never reach node 2. Even if nodes are allowed to transmit
their neighbors' information in addition to their own, as done in the OSE algorithm,
similar examples can be constructed that show the impossibility of optimal
estimation in the presence of communication asymmetry.

Figure 3.14. A measurement and communication graph pair in which it is impossible
for any distributed algorithm to converge to the optimal estimate. The
difficulty arises from the asymmetry in the communication graph that prevents
information of the relative measurements on certain edges from reaching certain
nodes.
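The weights 2/3 and 1/3 in the optimal estimate of $x_2$ can be verified with a short calculation. Node 2 is tied to the reference through two independent routes: the direct edge, and the two-edge path through node 3, whose error variance is the sum of the two edge variances. The sketch below assumes unit edge variances and combines independent unbiased estimates by inverse-variance weighting; the function name `blue_weights` is hypothetical.

```python
from fractions import Fraction


def blue_weights(variances):
    """Inverse-variance weights for combining independent unbiased estimates.

    Returns (weights, variance of the combined estimate).
    """
    inv = [1 / Fraction(v) for v in variances]
    total = sum(inv)
    return [w / total for w in inv], 1 / total


# Direct edge: variance 1. Path through node 3: variance 1 + 1 = 2.
weights, combined_variance = blue_weights([1, 2])
```

Here `weights` evaluates to 2/3 and 1/3, and the combined variance is 2/3, which is smaller than the unit variance node 2 is left with when only $\zeta_{12}$ reaches it.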

It is important to keep in mind that when we say no distributed algorithm
can converge to the optimal estimate when communication is asymmetric, we are
talking about algorithms that satisfy Constraint 3.1.

3.5.2  More  measurements  need  not  reduce  error 

Another important effect of asymmetric communication is that using more
measurements need not lead to more accurate estimates of all node variables -
the variance of some of the node variables' estimation error can in fact increase.
When communication is symmetric, the Jacobi algorithm converges to the optimal
estimate. The optimal estimate has the property that its estimation error
variance can only decrease upon using more measurements. This follows from the
so-called Rayleigh's monotonicity law of effective resistances; see Theorem 4.6.1 in
Chapter 4 for details. We conclude that with symmetric communication, having
more measurement edges, regardless of the associated error, produces more accurate
(lower-variance) estimates when either the Jacobi or the OSE algorithm is used.
However, the presence of asymmetry in the communication graph destroys this
monotonicity. We illustrate this effect of asymmetry with a particularly troubling
example, where the addition of a measurement edge causes the error variances of
all the node estimates to increase.

Figure 3.15 shows two measurement graphs $\mathcal{G}_1$ and $\mathcal{G}_2$ and their associated
communication graphs $\mathcal{G}_1^c$ and $\mathcal{G}_2^c$. The measurement graph $\mathcal{G}_1$ contains all the
nodes and edges of the measurement graph $\mathcal{G}_2$. Similarly, $\mathcal{G}_1^c$ contains all the nodes
and edges in $\mathcal{G}_2^c$. Every measurement error variance in both measurement
graphs is unity. The estimation error variances of the limiting estimates $x^{\infty}$
computed from Lemma 3.3.1 are shown alongside the graphs. It is clear from the
variances that the estimates in $\mathcal{G}_1$ are poorer than those in $\mathcal{G}_2$, even though $\mathcal{G}_1$
contains more measurements than $\mathcal{G}_2$.


3.6  Comments  and  open  problems 

Among the questions left unanswered in this chapter, perhaps the most important
ones are on distributed algorithms for estimating time-varying node variables
and establishing asynchronous convergence results for the Jacobi and OSE
algorithms in the general case of arbitrary positive definite edge covariances (i.e.,
without Diagonal Assumption 3.3.1).

Distributed estimation of node variables that are changing with time, i.e., that
have dynamics, is a challenging problem. It is equivalent to distributed Kalman
filtering, which is an open research problem.

[Figure 3.15: graphs not reproduced; variance labels recovered from the figure: 0.67, 0.67, 1.67.]

Figure 3.15. An example of more measurements reducing the estimation accuracy
of the Jacobi algorithm when communication is asymmetric. The figure shows two
measurement graphs $\mathcal{G}_1$ and $\mathcal{G}_2$, and their associated communication graphs $\mathcal{G}_1^c$
and $\mathcal{G}_2^c$. The variance of every measurement error is 1 for both measurement
graphs. The estimation error variances of the limiting estimate (that the Jacobi
algorithm converges to) computed from Lemma 3.3.1 are shown alongside the
graphs. Even though $\mathcal{G}_2 \subset \mathcal{G}_1$ and $\mathcal{G}_2^c \subset \mathcal{G}_1^c$, the resulting estimation error variances
are still higher in $(\mathcal{G}_1, \mathcal{G}_1^c)$ than in $(\mathcal{G}_2, \mathcal{G}_2^c)$.

To prove convergence of the asynchronous version of the algorithms (both
Jacobi and OSE), we had to assume a special structure of the measurement error
covariance matrices (see Diagonal Assumption 3.3.1). The reason is the following.
Consider a linear system of the form $Lx = b$, $b \in \mathbb{R}^n$, with $L \in \mathbb{R}^{n \times n}$ non-singular,
and let $L = M - N$ be a splitting of $L$, i.e., $M$ is non-singular. Define the iteration
operator

$$J : \mathbb{R}^n \to \mathbb{R}^n, \quad x \mapsto M^{-1}(Nx + b).$$

For the synchronous iteration to converge to the solution $L^{-1}b$, we need $\rho(J) < 1$,
where $\rho(J)$ denotes the spectral radius of the iteration matrix $M^{-1}N$.
The asynchronous iteration (corresponding to the synchronous one above) will
also converge to the solution for every initial condition if $\rho(|J|) < 1$, where $|J|$
represents the matrix obtained by replacing all the entries of $J$ with their absolute
values [92]. This condition is also necessary, i.e., if $\rho(|J|) \ge 1$, then there exists an
initial condition and a sequence of communication faults for which the asynchronous
iteration will not converge to the solution of $Lx = b$ [92]. The reader is advised to
see [92] and references therein for a review and historical perspective on the subject
of asynchronous parallel iterations. Diagonal Assumption 3.3.1 allowed us to
reduce the problem to the special case of scalar-valued node variables, i.e., $k = 1$.
The case $k = 1$ offered us two distinct advantages. First, the iteration matrix
$J = M^{-1}N$ turned out to be non-negative, so that $|J| = J$ and we only had to prove $\rho(J) < 1$.
Second, when $k = 1$, $M - N$ turned out to be an M-matrix,
which allowed us to exploit the available results in the extensive literature on
M-matrices and convergence of parallel iterative methods to show that the OSE
algorithm converges.
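The splitting and the two convergence conditions can be illustrated on a small example. The sketch below takes the Jacobi splitting ($M$ equal to the diagonal of $L$, $N = M - L$) and runs the synchronous iteration; the function names are hypothetical. Note that `abs_row_sum_bound` returns the maximum absolute row sum of $J$, which is only an upper bound on $\rho(|J|)$: a value below 1 is sufficient for $\rho(|J|) < 1$, but a value of 1 or more is inconclusive.

```python
def jacobi_matrix(L):
    """Iteration matrix J = M^{-1} N for the splitting M = diag(L), N = M - L."""
    n = len(L)
    return [[0.0 if i == j else -L[i][j] / L[i][i] for j in range(n)]
            for i in range(n)]


def abs_row_sum_bound(J):
    """Maximum absolute row sum of J, an upper bound on rho(|J|)."""
    return max(sum(abs(x) for x in row) for row in J)


def jacobi_solve(L, b, iters=100):
    """Synchronous iteration x <- M^{-1}(N x + b), started from zero."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x = [(b[i] - sum(L[i][j] * x[j] for j in range(n) if j != i)) / L[i][i]
             for i in range(n)]
    return x


# A 2x2 M-matrix: positive diagonal, non-positive off-diagonal entries.
L_example = [[2.0, -1.0], [-1.0, 2.0]]
bound = abs_row_sum_bound(jacobi_matrix(L_example))
x = jacobi_solve(L_example, [1.0, 1.0])
```

For this example $J$ is non-negative, so $|J| = J$, the bound equals 0.5, and the iterates approach the solution $(1, 1)$.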

In the general case, the edge weights are positive-definite matrices and not
positive scalars, $J = M^{-1}N$ is not a non-negative matrix and $L = M - N$ is
not an M-matrix. In fact, for the measurement network described in Section 3.3.3
for which numerical simulations were conducted, $\rho(|J|) > 1$. Therefore an asynchronous
scheme will not converge in general. However, in both the algorithms
the components of a node variable are always transmitted together, so the asynchronous
iterations that are of interest to us have a special structure. Moreover,
the special structure of $L$ suggests that, by thinking of it as a “block” M-matrix,
the Jacobi algorithm might still lead to a provably convergent asynchronous iteration.
Such generalizations of M-matrices have in fact been attempted, though
in a quite restrictive sense in most cases [93-96]. We believe there is hope for
proving asynchronous iteration convergence with block M-matrices, but in order
to do that a research program in generalizing M-matrices to block M-matrices has
to be undertaken first. So significant technical hurdles remain.

Another useful research direction is the investigation of convergence rates with
random communication faults. For this, second moment convergence has to be
established first. A convergence rate for the second moment of the error with random
link failures, even for simple failure distributions, would be quite useful to the
practitioner. Since the motivation for parallel iterative methods has traditionally
been the solution of large problems on clusters of powerful machines connected
over a wired network, asynchrony usually comes from delays in the
wired network and varying processor speeds. As a result, in the vast literature
on parallel iterative methods, analysis of convergence with random faults is rare
(one notable exception being the work of Strikwerda [97]). With the recent interest
in distributed computation in wireless networks, analysis of parallel iterative
methods with random communication faults may be quite useful.

No analytical results on the convergence rate of the OSE algorithm were obtained
here. It was shown through simulations to converge faster than Jacobi. Establishing
the convergence rate of the Weighted Additive Schwarz method, to which OSE is
closely related, is recognized to be quite challenging [83]. Obtaining the convergence
rate of the OSE algorithm is therefore a challenging open problem, but one that
is also of interest to a wider community.


3.7 Proofs


In the proofs, we will use properties of non-negative matrices and M-matrices.
First we show the following:

Proposition 3.7.1. The matrix $L_c$ defined in (3.12) for the graph pair $(\mathcal{G}, \mathcal{G}^c)$,
for the special case of $k = 1$, is an M-matrix as long as Assumption 3.1.1 is
satisfied. □

Proof of Proposition 3.7.1. In the case $k = 1$, the measurement error covariance
$P_e$ on edge $e$ is simply a variance $\sigma_e^2$. Setting

$$s := \max_u [D]_{u,u}, \qquad B := sI - L_c,$$

and applying the Gerschgorin circle theorem [75], we conclude that $\rho(B) \le s$.
This proves that $L_c$ is an M-matrix. ■

Proposition 3.7.2 (Theorem 2.3(N45) of [78]). Let $X$ be a matrix of class $Z$, that is, a real square matrix whose off-diagonal terms are non-positive. Then the following statements are equivalent:

1. there exists a representation $X = K - Q$ with $K^{-1} \ge 0$, $Q \ge 0$ (entrywise) such that $\rho(K^{-1}Q) < 1$.

2. $X$ is a non-singular M-matrix. □

Proof of Lemma 3.3.2. Consider the following discrete-time dynamical system:

\[ e^{(i+1)} = M^{-1} N e^{(i)}, \tag{3.27} \]

where $e^{(i)}$ is the state vector at iteration $i$. Note that $M$ is invertible by construction since it is a diagonal matrix and every diagonal element is positive, which is guaranteed by Assumption 3.1.1. Due to the structure of the matrices $M^{-1}$ and $N$, (3.27) implies that in every iteration $i$, each node $u \in \mathcal{V}$ computes its new state $e_u^{(i)}$ as the weighted average of the states of those nodes $v$ that have an edge $(v,u)$ directed from $v$ toward $u$ in the graph $\mathcal{G}_c$. In other words, it is a distributed average-consensus algorithm where the reference nodes keep their values at 0, and the remaining nodes try to reach consensus by averaging with their neighbors. The system (3.27) satisfies the strict convexity assumption of [98]. Thus, from Theorem 2 of [98] we know that the system (3.27) is uniformly globally attractive with respect to the collection of equilibrium points (which in this case is $\{0\}$) if and only if for every node $u \in \mathcal{V} \setminus \mathcal{V}_r$, there is at least one reference node such that there is a directed path in $\mathcal{G}_c$ from the reference node to $u$. Note that here we have used a slight specialization of the results in [98] to the case when one or more agents do not participate in the consensus algorithm but keep their values fixed. On the other hand, it follows from (3.27) that $e^{(i)} \to 0$ as $i \to \infty$ if and only if $\rho(M^{-1}N) < 1$. Since $L_c$ is an M-matrix, it follows from Proposition 3.7.2 that $L_c = M - N$ is non-singular if and only if for every node $u \in \mathcal{V} \setminus \mathcal{V}_r$ there is at least one reference node such that there is a directed path in $\mathcal{G}_c$ from the reference node to $u$. This proves the first statement of the theorem. ■
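
The equivalence used in this proof can be checked numerically on small examples. The following is only an illustrative sketch under assumptions that are not taken from the text: it uses $k = 1$, symmetric communication, and a grounded weighted graph Laplacian as a stand-in for $L_c$; the helper names `grounded_laplacian` and `jacobi_spectral_radius` are hypothetical.

```python
import numpy as np


def grounded_laplacian(n, edges, refs):
    """Weighted Laplacian with the rows/columns of the reference nodes removed.

    Assumed stand-in for L_c when k = 1 and communication is symmetric.
    edges: iterable of (u, v, variance); each edge has weight 1 / variance.
    """
    lap = np.zeros((n, n))
    for u, v, var in edges:
        w = 1.0 / var
        lap[u, u] += w
        lap[v, v] += w
        lap[u, v] -= w
        lap[v, u] -= w
    keep = [i for i in range(n) if i not in refs]
    return lap[np.ix_(keep, keep)]


def jacobi_spectral_radius(L):
    """Spectral radius of J = M^{-1} N for the splitting L = M - N, M = diag(L)."""
    M = np.diag(np.diag(L))
    N = M - L
    J = np.linalg.solve(M, N)
    return float(np.max(np.abs(np.linalg.eigvals(J))))


# Path 0 - 1 - 2 - 3 with reference node 0: every node can be reached from it.
rho_connected = jacobi_spectral_radius(
    grounded_laplacian(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)], refs={0})
)

# Nodes 2 and 3 are cut off from the reference node 0.
rho_cut = jacobi_spectral_radius(
    grounded_laplacian(4, [(0, 1, 1.0), (2, 3, 1.0)], refs={0})
)
```

In the first graph the spectral radius is strictly below one, whereas in the second, where two nodes have no path from the reference, it equals one and the grounded Laplacian is singular.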

Proof of Theorem 3.3.3. We only consider the case $k = 1$ in the proof. The case for $k > 1$ follows from Assumption 3.3.1, as explained earlier.

When no edge or node fails permanently, the Jacobi algorithm with time-varying communication qualifies as an asynchronous iteration as defined in [92, Defn. 2.2]. Now we use Theorem 4.1 from [92], which states that the asynchronous iteration corresponding to the synchronous iteration

\[ x \leftarrow M^{-1}(Nx + b) \]

converges to the solution of $(M - N)x = b$ if the following conditions are satisfied:

1. $M$ is non-singular,

2. $\rho(|M^{-1}N|) < 1$, where, for a matrix $X$, $|X|$ denotes the matrix obtained by replacing every entry of $X$ with its absolute value.

The first condition is satisfied by Assumption 3.1.1, as proved in Proposition 3.3.1. From the condition on existence of paths, it follows that $\rho(M^{-1}N) < 1$ (see Lemma 3.3.2). Because of the non-negativity of $M$ and $N$, which follows from Diagonal Assumption 3.3.1, we get $\rho(|M^{-1}N|) < 1$. This proves the first part of the theorem.

If the existence of path condition is violated, it follows from Lemma 3.3.2 that $\rho(M^{-1}N) \ge 1$, which means the algorithm will not converge. If a communication edge or node fails permanently, or a communication edge that is not in $\mathcal{G}_c^{\text{init}}$ becomes active at a later time and is active infinitely often, then we can construct a new "initial" communication graph $\mathcal{G}_c^{2}$ that includes that communication edge, and apply the arguments above to conclude that the Jacobi algorithm converges, but to the solution of $L_{c2} x = b_{c2}$ that is defined for $(\mathcal{G}, \mathcal{G}_c^{2})$. Assumption 3.1.1 ensures that $L_c$ changes if the communication graph $\mathcal{G}_c^{\text{init}}$ is changed. Therefore, the Jacobi algorithm cannot converge to the solution of $L_c x = b_c$ if a communication edge that is not in $\mathcal{G}_c^{\text{init}}$ becomes active at a later time and is active infinitely often.

To prove almost sure convergence in the presence of random failures, define the events:

\[ C = \{\text{the Jacobi algorithm converges}\}, \]
\[ S_j^e = \{\text{the communication edge } e \text{ is active in time } j\}, \]
\[ A_j = \bigcap_{e \in \mathcal{E}_c} S_j^e, \]
\[ \{A_j \ \text{i.o.}\} = \bigcap_{j=1}^{\infty} \bigcup_{\ell \ge j} A_\ell. \]

Occurrence of the event $A_j$ means all the communication edges were active in time $j$, i.e., no edge or node failed in that time. If $A_j$ occurs for infinitely many $j$'s, i.e., if $\{A_j \ \text{i.o.}\}$ occurs, then the algorithm converges, since the two sufficient conditions mentioned above for the convergence of the asynchronous Jacobi algorithm are satisfied. Therefore,

\[ P(C) \ge P(A_j \ \text{i.o.}), \tag{3.28} \]

where $P(\cdot)$ denotes probability. We will now show that the right hand side above is 1. Since $A_j$ occurs if and only if every communication edge and every node is active in time $j$, we have

\[ P(A_j) = (1 - p)^m (1 - q)^n, \qquad \forall j, \]

which is a positive number (since $p, q < 1$) that does not depend on $j$. It follows that

\[ \sum_{j=1}^{\infty} P(A_j) = \infty, \]

and moreover, the events $\{A_j\}$ are independent since the node or edge failures at a time step are independent of failures in the past and the future. Therefore, by the second Borel-Cantelli lemma [99], we get

\[ P(A_j \ \text{i.o.}) = 1. \]

It follows from (3.28) that

\[ P(C) = 1, \]

which shows that the Jacobi algorithm converges almost surely.
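
The almost sure convergence argument can be illustrated with a small simulation. The sketch below is not the algorithm of Chapter 3 verbatim: it assumes scalar node variables ($k = 1$), unit error variances, independent link failures only (no node failures), and a node that reuses the last value heard from a neighbor whose link is down. The function name `jacobi_with_link_failures` and the measurement values are hypothetical.

```python
import random


def jacobi_with_link_failures(n, measurements, ref, p_fail, steps, seed=0):
    """Scalar Jacobi iteration in which every link fails independently per step.

    measurements: dict {(u, v): zeta}, zeta being a noisy value of x_u - x_v
    (unit error variance assumed). A node that misses a message reuses the
    last value it heard from that neighbor.
    """
    rng = random.Random(seed)
    nbrs = {u: [] for u in range(n)}
    for (u, v), zeta in measurements.items():
        nbrs[u].append((v, zeta))    # x_u is approximately x_v + zeta
        nbrs[v].append((u, -zeta))   # x_v is approximately x_u - zeta
    x = [0.0] * n
    heard = {(u, v): 0.0 for u in range(n) for v, _ in nbrs[u]}
    for _ in range(steps):
        for (u, v) in measurements:
            if rng.random() >= p_fail:      # link (u, v) is active at this step
                heard[(u, v)] = x[v]
                heard[(v, u)] = x[u]
        new_x = list(x)
        for u in range(n):
            if u != ref:                    # the reference keeps its value 0
                new_x[u] = sum(heard[(u, v)] + z for v, z in nbrs[u]) / len(nbrs[u])
        x = new_x
    return x


# Four nodes on a cycle, node 0 is the reference (hypothetical measurements).
zetas = {(1, 0): 1.1, (2, 1): 0.9, (3, 2): 1.2, (3, 0): 2.9}
x_reliable = jacobi_with_link_failures(4, zetas, ref=0, p_fail=0.0, steps=2000)
x_lossy = jacobi_with_link_failures(4, zetas, ref=0, p_fail=0.3, steps=5000)
```

With failure probability 0.3 the iterates still settle at the same estimates as the failure-free run, only more slowly, which is the behavior the proof predicts when every link is active infinitely often.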


Proof of Theorem 3.3.4. Recall the decomposition $L_c = M - N$ with $M$ diagonal. Due to the assumption of symmetric communication, $L = L_c$ and the matrices $L$, $M$, $N$ are symmetric. The iteration counter is updated simultaneously by all the nodes. The error in the $i$-th iteration, $e^{(i)} := \hat{x}^{(i)} - x^*$, propagates as

\[ e^{(i)} = J^i e^{(0)}, \quad \text{where } J := M^{-1}N. \]

Since the communication graph is symmetric, it follows from Theorem 3.3.2 that the Jacobi algorithm converges to the optimal estimates, and $\rho(J) < 1$. It follows from the above that

\[ \frac{\|e^{(i)}\|}{\|e^{(0)}\|} \le \rho(J)^i \;\Rightarrow\; \epsilon^{(i)} \le \rho(J)^i, \]

where the second inequality follows from the definition of the normalized error in (3.3). It is easy to see from the above that

\[ N_{\text{iter}}(\epsilon) \le \left\lceil \frac{|\log \epsilon|}{|\log \rho(J)|} \right\rceil. \tag{3.29} \]

From Theorem 5.6 of [78], we have

\[ \rho(J) = \frac{\rho(L^{-1}N)}{1 + \rho(L^{-1}N)} = 1 - \frac{1}{1 + \rho(L^{-1}N)}. \tag{3.30} \]

Now, $L^{-1}N = L^{-1}M - I$, and the eigenvalues of $L^{-1}M$ are the same as those of $M^{1/2}L^{-1}M^{1/2}$, where $M^{1/2}$ is the unique positive definite square root of the positive definite diagonal matrix $M$. Hence we get

\[ \rho(L^{-1}N) = \rho(M^{1/2}L^{-1}M^{1/2}) - 1 = \|M^{1/2}L^{-1}M^{1/2}\| - 1, \tag{3.31} \]

where $\|\cdot\|$ represents the matrix 2-norm and the second equality follows from the matrix $M^{1/2}L^{-1}M^{1/2}$ being symmetric positive definite. From the definition of the matrix 2-norm, we have

\[ \|M^{1/2}L^{-1}M^{1/2}\| = \max_{y \neq 0} \frac{y^T M^{1/2} L^{-1} M^{1/2} y}{y^T y} = \max_{z \neq 0} \frac{z^T L^{-1} z}{z^T M^{-1} z}, \]

where $z := M^{1/2} y$. It is straightforward to see that for every vector $z$, $z^T M^{-1} z \ge \frac{1}{d_{\max}(P)} z^T z$. Therefore,

\[ \|M^{1/2}L^{-1}M^{1/2}\| \le d_{\max}(P) \max_{z \neq 0} \frac{z^T L^{-1} z}{z^T z} = \frac{d_{\max}(P)}{\lambda_{\min}(L)}, \]

where $\lambda_{\min}(L)$ denotes the smallest eigenvalue of $L$. It follows from (3.31) that

\[ \rho(L^{-1}N) \le \frac{d_{\max}(P)}{\lambda_{\min}(L)} - 1 \;\Rightarrow\; \rho(J) \le 1 - \frac{\lambda_{\min}(L)}{d_{\max}(P)} \quad (\text{from (3.30)}). \]

It follows from the Gerschgorin circle theorem [100, pg. 498] that all eigenvalues of $L$ are less than $d_{\max}$, so $\lambda_{\min}(L)/d_{\max}(P) < 1$. Taking logarithm of both sides of the inequality above, and using $|\log(1 - x)| \ge x$ for $0 < x < 1$, we get

\[ |\log \rho(J)| \ge \frac{\lambda_{\min}(L)}{d_{\max}(P)}. \]

Plugging it back in (3.29), we get

\[ N_{\text{iter}}(\epsilon) \le \left\lceil \frac{d_{\max}(P)\, |\log \epsilon|}{\lambda_{\min}(L)} \right\rceil, \]

which proves the upper bound on the number of iterations required. The proof of the lower bound is similar, and is therefore omitted. ■
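
The chain of inequalities in this proof can be checked numerically on a small instance. The sketch below makes several assumptions that are not from the text: it uses $k = 1$ with unit weights, takes the grounded Laplacian of a four-node path as $L$, and interprets $d_{\max}(P)$ as the largest diagonal entry of $M$.

```python
import math
import numpy as np

# Grounded Laplacian of the unit-weight path 0 - 1 - 2 - 3 with reference node 0
# (assumed example; the row and column of the reference node are removed).
L = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
M = np.diag(np.diag(L))          # Jacobi splitting L = M - N
N = M - L
J = np.linalg.solve(M, N)

rho_J = float(np.max(np.abs(np.linalg.eigvals(J))))
lam_min = float(np.linalg.eigvalsh(L)[0])   # smallest eigenvalue of L
d_max = float(np.max(np.diag(M)))           # assumed meaning of d_max(P) here

eps = 1e-3
# Iteration count implied by rho(J)^i <= eps, and the cruder bound in terms of
# d_max and lambda_min(L).
iters_from_rho = math.ceil(abs(math.log(eps)) / abs(math.log(rho_J)))
iters_bound = math.ceil(d_max * abs(math.log(eps)) / lam_min)
```

On this instance the spectral radius respects $\rho(J) \le 1 - \lambda_{\min}(L)/d_{\max}$, and the iteration count obtained from $\rho(J)$ directly is no larger than the bound expressed through $\lambda_{\min}(L)$.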
Chapter  4


Optimal  estimation  in  infinite 
graphs  and  electrical  analogy 

4.1  Introduction 

In this chapter we take the first steps, and develop the tools needed, toward answering the error scaling question raised in Chapter 1. As discussed in Chapter 2, we examine the error scaling of the optimal estimator (BLUE), since it achieves the minimum variance among all linear unbiased estimators, and therefore gives us an algorithm-independent limit on estimation accuracy.

We focus our attention on large measurement graphs in answering the error scaling question. A large measurement graph can result from a sensor-actuator network obtained by deploying a large number of nodes. Sensor networks consisting of thousands of wireless nodes are already in the test and deployment phase [13]. Large networks are being envisioned for civil as well as defense applications [101]. Large measurement graphs can also result from a small number of physical agents that are mobile, since in that case the measurement graph consists of all the variables and measurements that appear over a time interval of interest. For example, in the multi-robot localization example described in Section 2.1.4, the nodes of the measurement graph are the positions of the mobile robots at various time instants. Upon collecting all the relative measurements over a time interval $(0, t)$, we obtain a measurement graph $\mathcal{G}(t)$. In an experiment described in [102], a team of 80 robots was deployed in a "search and protect" mission. If every robot takes relative position measurements every minute with three other robots on average, after an hour the measurement graph will consist of 4800 nodes and 14400 edges. Thus, large measurement graphs are quite likely in sensor-actuator networks.
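
The node and edge counts quoted for the robot example follow from simple bookkeeping. The snippet below spells it out, assuming one node per robot position per minute and one edge per relative measurement taken by the measuring robot; the variable names are illustrative.

```python
robots = 80
minutes = 60                      # one hour, one measurement round per minute
partners_per_round = 3            # average number of robots measured each round

nodes = robots * minutes                       # one node per robot position per minute
edges = robots * partners_per_round * minutes  # one edge per relative measurement
```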

To investigate error scaling in a large graph, we consider the limiting situation of an infinite graph, with a countable number of nodes and edges (i.e., variables and relative measurements). The results in this chapter show that as one considers larger and larger numbers of measurements, the minimum estimation error covariance of a node variable tends to a limiting covariance matrix that is positive definite. This limiting covariance is characterized by the effective resistance in an abstract electrical network in which currents, voltages and resistances are matrix-valued. The main assumption needed is that the graph must have bounded degree.

It is often easier to establish asymptotic results for infinite graphs than for large finite graphs, since boundary effects are usually weaker in infinite graphs than in finite graphs. This advantage is exploited in the next chapter in obtaining error scaling laws in infinite measurement graphs. The results of this chapter provide a formal justification for regarding infinite graphs as suitable proxies for large but finite graphs.

Chapter Organization: After stating the problem addressed in this chapter precisely in Section 4.2, we summarize the main results and related prior work in Section 4.3. Generalized electrical networks are introduced in Section 4.4, along with a few technical results on such networks, so that the main results of the chapter can be stated, which is done in Section 4.5. We return to generalized electrical networks in more detail in Section 4.6, which also describes a few technical but crucial results needed to establish the main results of this chapter (and also of the next). Section 4.7 provides the proof of the main results, which is followed by a discussion of relevant open problems in Section 4.8.


4.2  Problem  statement 

Consider an infinite measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with a single reference node $o \in \mathcal{V}$, where the node set $\mathcal{V}$ and edge set $\mathcal{E}$ are countably infinite. The measurement error covariances are specified by an edge-covariance function $P : \mathcal{E} \to \mathbb{S}^{k+}$, where $\mathbb{S}^{k+}$ is the set of $k \times k$ symmetric positive definite matrices. The pair $(\mathcal{G}, P)$ is called a measurement network. Imagine, for the moment, that we are interested in the estimate of a particular node $u \in \mathcal{V}$. Let $\mathcal{G}_{\text{finite}} = (\mathcal{V}_{\text{finite}}, \mathcal{E}_{\text{finite}})$ be a finite subgraph of the infinite graph $\mathcal{G}$ that contains both $u$ and $o$, i.e., $\mathcal{V}_{\text{finite}} \subset \mathcal{V}$ and $\mathcal{E}_{\text{finite}} \subset \mathcal{E}$. One can regard the subgraph $\mathcal{G}_{\text{finite}}$ as consisting of those measurements that are processed up to some time $t < \infty$. Given the finite measurement network $(\mathcal{G}_{\text{finite}}, P)$, it is straightforward to compute the optimal estimate $\hat{x}_u^*(\mathcal{G}_{\text{finite}})$ of the unknown node variable $x_u$ in the network $(\mathcal{G}_{\text{finite}}, P)$, as described in Chapter 2. The covariance matrix of the error in this estimate is

\[ \Sigma_{u,o}(\mathcal{G}_{\text{finite}}) := \mathrm{E}\big[(x_u - \hat{x}_u^*(\mathcal{G}_{\text{finite}}))(x_u - \hat{x}_u^*(\mathcal{G}_{\text{finite}}))^T\big], \]

which exists as long as $\mathcal{G}_{\text{finite}}$ is weakly connected (see Theorem 2.2.1). As time goes by, one can process larger subsets of measurements, which can be visualized by a sequence of progressively larger finite subgraphs of $\mathcal{G}$. In this context, we are interested in studying if there is a limit to the estimation accuracy achievable by processing more and more measurements. Specifically, we want to know if there is a point beyond which there is little gain in processing more measurements, as it will not improve the estimate of $x_u$ significantly. This raises the following question:

Limiting accuracy in infinite graphs: Consider an infinite measurement network $(\mathcal{G}, P)$, with the graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ that has a single reference node $o \in \mathcal{V}$. For every node variable $x_u$, $u \in \mathcal{V}$, what is the minimum possible estimation error covariance that can be achieved by using an arbitrarily large, finite subset of the measurements available in $\mathcal{G}$?

Once we characterize the minimum possible estimation error of a node variable in infinite graphs, in Chapter 5 we examine how this error varies as a function of the node's distance from the reference, and how this variation depends on the structure of the graph. The error scaling question raised in Chapter 1 is thus answered.

A measurement network is assumed to satisfy the following assumption.

Assumption 4.2.1 (measurement network). The measurement network $(\mathcal{G}, P)$ satisfies the following properties:

1. The graph $\mathcal{G}$ is weakly connected.

2. The graph $\mathcal{G}$ has a finite maximum node degree$^1$.

3. The edge-covariance function $P$ is uniformly bounded, i.e., there exist constant symmetric positive definite matrices $P_{\min}$, $P_{\max}$ such that $P_{\min} \le P_e \le P_{\max}$, $\forall e \in \mathcal{E}$.

4. The measurement errors on distinct edges are uncorrelated, i.e., if $e$ and $e'$ are two distinct edges in $\mathcal{G}$, then $\mathrm{E}[\epsilon_e \epsilon_{e'}^T] = 0$. □

$^1$The degree of a node is the number of edges that are incident on the node. An edge $(u,v)$ is said to be incident on the nodes $u$ and $v$.

In the above, for two matrices $A, B \in \mathbb{R}^{k \times k}$, $A > (\ge)\, B$ means $A - B$ is positive definite (semidefinite), which is also written as $A - B > (\ge)\, 0$. We write $A < (\le)\, B$ if $-A > (\ge)\, -B$.

To formulate the problem of determining the limiting accuracy in infinite graphs precisely, we define the limiting BLUE error covariance for a node variable $x_u$ in an infinite measurement network $(\mathcal{G}, P)$ as

\[ \Sigma_{u,o} := \inf_{\mathcal{G}_{\text{finite}}} \Sigma_{u,o}(\mathcal{G}_{\text{finite}}), \tag{4.1} \]

where the infimum is taken over all finite subgraphs $\mathcal{G}_{\text{finite}}$ of $\mathcal{G}$ that contain the nodes $u$ and $o$. We define a matrix $M \in \mathbb{S}^{k+}$ to be the infimum of the set $S$, where $S \subset \mathbb{S}^{k+}$, and write

\[ M = \inf S, \tag{4.2} \]

if $M \le A$ for every matrix $A \in S$, and for every positive real $\epsilon$, there exists $B \in S$ such that $M + \epsilon I_k > B$. We will show in Section 4.5 that under Assumption 4.2.1, the infimum $\Sigma_{u,o}$ in (4.1) is well-defined and is a symmetric positive definite matrix. In the sequel, we will often say "limiting BLUE covariance in a network" without specifying if the network is finite or infinite. When the network is finite, this phrase should be understood to mean simply the BLUE covariance.

Note that the BLUE covariance above is not defined in terms of the error in the estimate obtained by using all of the infinitely many measurements available in $\mathcal{G}$. There are two reasons for this. First, in practice, one may have a very large number of measurements but never an infinite number of them. So establishing the limit of estimation accuracy that is achievable by using an arbitrarily large but finite number of measurements is more relevant from a practical point of view. The second reason is that characterizing the best linear unbiased estimator and its covariance for an infinite number of measurements is technically challenging.


4.3  Contributions  and  prior  work 

The  results  established  in  this  chapter  are  the  following: 

1. In a measurement network (finite or infinite), the limiting BLUE error covariance $\Sigma_{u,o}$ that can be achieved by using arbitrarily large finite subsets of measurements is equal to a matrix-valued effective resistance $R^{\text{eff}}_{u,o}$ between $u$ and $o$ in a generalized electrical network, in which currents, voltages and resistances are matrix-valued. The electrical network is obtained by assigning to every edge of the measurement graph a matrix-valued resistance equal to the covariance of the measurement error on that edge. This result is called the electrical analogy.

2. We show that for every positive constant $\epsilon > 0$, it is possible to construct an unbiased estimate for a node variable $x_u$ using only a finite subset of the available measurements but whose estimation error covariance is only $\epsilon$ above the minimum possible estimation error variance that could be obtained by considering an arbitrarily large number of the available measurements.

This convergence result provides the formal justification for regarding infinite graphs as suitable proxies for very large but finite graphs, and establishes the conditions under which such approximation is valid. In particular, the assumption of finite maximum node degree is needed.

Another implication of this result is that for estimation problems based on relative measurements, after a certain point, considering more measurements will only marginally improve the quality of the estimate. On the positive side, this simplifies the construction of estimation algorithms in large-scale networks because it justifies considering a relatively small subset of measurements. Although in Chapter 3 distributed algorithms were proposed and analyzed to compute the optimal estimates, for a large network these algorithms may take a long time to provide accurate estimates. The reason is that information about all the available measurements is fused iteratively to determine the estimate of every node variable. The results of this chapter suggest that it may be possible to devise algorithms that obtain estimates quite fast, while sacrificing little accuracy.

3. The BLUE covariance of a node variable can only decrease upon using more measurements, and can only increase upon using fewer measurements. This monotonicity of BLUE covariances is a result of our extension of a classical result for electrical networks, named Rayleigh's monotonicity law [2], to generalized electrical networks. This monotonicity property of matrix-valued effective resistances is crucial not only in proving all the results in this chapter, but also in establishing the error scaling laws in Chapter 5.

Prior work: To the best of our knowledge, the question on limiting accuracy in infinite graphs posed and answered in this chapter has not been investigated earlier. The analogy between estimator variance and effective resistance, when the node variables are scalars, was noted by Karp et al. [73] in connection with the time-synchronization problem. Here we show that an electrical analogy still holds for vector-valued node variables, provided that we consider generalized electrical networks in which currents, voltages, and resistors are matrix-valued. Another important distinction with [73] is that the measurement graph was finite in Karp et al. [73], whereas we allow the measurement graph to be infinite.

The extension of the electrical analogy to infinite graphs requires, among other things, that the currents and voltages in an infinite, generalized electrical network are well defined. Our proof that this is so parallels the work of Flanders, who in 1971 provided perhaps the earliest exposition of infinite electrical networks [103]. Although our proof technique is different, the results for infinite generalized electrical networks are direct extensions of Flanders' results for infinite scalar electrical networks.


4.4  Generalized  electrical  networks 


A generalized electrical network $(\mathcal{G}, R)$ consists of a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ (finite or infinite) together with a function $R : \mathcal{E} \to \mathbb{S}^{k+}$ that assigns to each edge $e \in \mathcal{E}$ a symmetric positive definite matrix $R_e$ called the generalized resistance of the edge.

Recall that a generalized flow from node $u \in \mathcal{V}$ to node $v \in \mathcal{V}$ with intensity $j \in \mathbb{R}^{k \times k}$ is an edge-function $\mathbf{j} : \mathcal{E} \to \mathbb{R}^{k \times k}$ such that

\[ \sum_{q : (w,q) \in \mathcal{E}} \mathbf{j}_{w,q} - \sum_{q : (q,w) \in \mathcal{E}} \mathbf{j}_{q,w} = \begin{cases} j, & w = u, \\ -j, & w = v, \\ 0, & \text{otherwise}, \end{cases} \qquad \forall w \in \mathcal{V}. \tag{4.3} \]

We say that a flow $\mathbf{i}$ is a generalized current when there is a node-function $V : \mathcal{V} \to \mathbb{R}^{k \times k}$ for which

\[ R_{u,v}\, \mathbf{i}_{u,v} = V_u - V_v, \qquad \forall (u,v) \in \mathcal{E}. \tag{4.4} \]

The node-function $V$ is called a generalized potential associated with the current $\mathbf{i}$. Eq. (4.3) should be viewed as a generalized version of Kirchhoff's current law and can be interpreted as: the net flow out of each node other than $u$ and $v$ is equal to zero, whereas the net flow out of $u$ is equal to the net flow into $v$ and both are equal to the flow intensity $j$. Eq. (4.4) provides, in a combined manner, a generalized version of Kirchhoff's loop law, which states that the net potential drop along a circuit must be zero, and Ohm's law, which states that the potential drop across an edge must be equal to the product of its resistance and the current flowing through it. A circuit is an undirected path that starts and ends at the same node. For $k = 1$, generalized electrical networks are the usual electrical networks with scalar currents, potentials, and resistors.

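The energy dissipated by an edge-function $\mathbf{j}$ in the network $(\mathcal{G}, R)$ is defined by

\[ \|\mathbf{j}\| := \Big( \sum_{e \in \mathcal{E}} \operatorname{trace}(\mathbf{j}_e^T R_e \mathbf{j}_e) \Big)^{1/2}. \tag{4.5} \]

It is straightforward to verify that the set of edge-functions with finite dissipated energy constitutes a Hilbert space $\mathcal{H}_R$ with inner product $\langle \mathbf{i}, \mathbf{j} \rangle = \sum_{e \in \mathcal{E}} \operatorname{trace}(\mathbf{i}_e^T R_e \mathbf{j}_e)$, $\forall\, \mathbf{i}, \mathbf{j} \in \mathcal{H}_R$. For infinite networks, the summation in (4.5) is an absolutely convergent series and the order of summation is irrelevant. Flows of finite support always belong to $\mathcal{H}_R$.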

4.4.1  Existence  and  uniqueness  of  generalized  current 

Existence and uniqueness of scalar currents in infinite networks has been examined in [103, 104]. It was shown by Flanders that, unlike in finite networks, in an infinite electrical network the current is not uniquely determined by Kirchhoff's laws and Ohm's law [103]. He showed, however, that uniqueness of current in an infinite network can be established if two additional conditions are imposed: the current has a finite dissipated energy and it is the limit of flows with finite support. For this reason, in examining the uniqueness of generalized currents in infinite networks we restrict ourselves to generalized flows that are limits of finite support flows and that have finite dissipated energy. For finite networks these conditions hold trivially.

The following theorem establishes existence, uniqueness, and linearity of generalized currents and potential differences in generalized electrical networks. The proof of this result is provided in Section 4.9.

Theorem 4.4.1 (Generalized Current). In a generalized electrical network $(\mathcal{G}, R)$ that satisfies Assumption 4.2.1, for every pair of nodes $u, v \in \mathcal{V}$ and for every intensity $i \in \mathbb{R}^{k \times k}$, among all flows that have finite dissipated energy and are limits of finite support flows, there exists a unique current $\mathbf{i}$ from $u$ to $v$ with intensity $i$. In addition,

1. the current is the flow that minimizes the energy dissipation, among all flows from node $u$ to node $v$ with intensity $i$ that are limits of finite support flows, and

2. the current $\mathbf{i}$ and the potential difference $V_p - V_q$ (for every $p, q \in \mathcal{V}$) are linear functions of the intensity $i$. The potential is unique only up to an additive constant. □

It was previously known that in a scalar electrical network, the current minimizes energy dissipation. This result is known as Thomson's Minimum Energy Principle [2, 104]. Theorem 4.4.1 shows that generalized currents also obey Thomson's Principle in both finite and infinite networks.
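
A small finite example may help make the minimum energy property concrete. The sketch below is an illustration under assumptions of our own choosing: a triangle network with $k = 2$ and arbitrary symmetric positive definite resistances `R1`, `R2`, `R3`. On a triangle every flow of intensity $I_k$ from node 0 to node 2 is determined by the matrix `a` routed over the two-edge path, which makes the comparison between the current and other flows easy to set up.

```python
import numpy as np

k = 2
R1 = np.array([[2.0, 0.5], [0.5, 1.0]])   # edge (0, 1), assumed values
R2 = np.array([[1.0, 0.2], [0.2, 3.0]])   # edge (1, 2), assumed values
R3 = np.array([[4.0, 1.0], [1.0, 2.0]])   # edge (0, 2), assumed values
I_k = np.eye(k)
resistances = [R1, R2, R3]


def energy(flows, resistances):
    """Square root of sum_e trace(j_e^T R_e j_e), as in (4.5)."""
    return np.sqrt(sum(np.trace(j.T @ R @ j) for j, R in zip(flows, resistances)))


def split_flow(a):
    """Flow of intensity I_k from node 0 to node 2: a over 0 -> 1 -> 2, the rest direct."""
    return [a, a, I_k - a]


# The current also satisfies (4.4) around the loop: (R1 + R2) a = R3 (I_k - a).
a_current = np.linalg.solve(R1 + R2 + R3, R3)
e_current = energy(split_flow(a_current), resistances)

# Energies of randomly perturbed flows with the same intensity.
rng = np.random.default_rng(0)
e_others = [
    energy(split_flow(a_current + 0.3 * rng.standard_normal((k, k))), resistances)
    for _ in range(100)
]

# Series-parallel value of the effective resistance between nodes 0 and 2.
R_eff = np.linalg.inv(np.linalg.inv(R3) + np.linalg.inv(R1 + R2))
```

Every perturbed flow dissipates more energy than the current, and the squared energy of the current equals the trace of the series-parallel effective resistance.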
4.4.2  Generalized  effective  resistance


It was shown in the previous section that the potential difference $V_u - V_v \in \mathbb{R}^{k \times k}$ associated with a current of intensity $i \in \mathbb{R}^{k \times k}$ flowing from $u$ to $v$ is a linear function of $i$. It turns out that this linear map can be expressed through multiplication by a $k \times k$ matrix, which is stated next. The proof of this result is provided in Section 4.9.

Lemma 4.4.1. Let $(\mathcal{G}, R)$ be a generalized electrical network satisfying Assumption 4.2.1. The linear mapping between $i$ and $V_u - V_v$ can be defined by multiplication by a $k \times k$ matrix, which we call the generalized effective resistance $R^{\text{eff}}_{u,v}$ between $u$ and $v$:

\[ V_u - V_v = R^{\text{eff}}_{u,v}\, i, \qquad \forall i \in \mathbb{R}^{k \times k}. \]  □

In the sequel, we will refer to generalized effective resistance simply as effective resistance. In view of Lemma 4.4.1, the effective resistance between two nodes is the potential difference between them when a current with intensity $I_k$, the $k \times k$ identity matrix, is injected at one node and extracted at the other, which is analogous to the definition of effective resistance in scalar networks [2]. Moreover, the effective resistance is a symmetric positive-definite matrix. To show this, we will need the following technical result (also proved in Section 4.9), which will be useful again in the sequel.

Lemma 4.4.2. Let $\mathbf{i} \in \mathcal{H}_R$ be the unique current in the network $(\mathcal{G}, R)$ with intensity $i \in \mathbb{R}^{k \times k}$ from $u$ to $v$, and let $\mathbf{j}$ be a flow with intensity $j \in \mathbb{R}^{k \times k}$ from $u$ to $v$ that can be expressed as a limit of finite support flows. Then,

\[ \sum_{e \in \mathcal{E}} \mathbf{i}_e^T R_e \mathbf{j}_e = (V_u - V_v)^T j, \]

where $V$ is a generalized potential associated with the current $\mathbf{i}$. Moreover, the series in the left-hand side converges absolutely, meaning that each one of the $k^2$ series that constitute the matrix-valued left-hand side converges absolutely. □

To prove positive-definiteness of effective resistances, set $\mathbf{j} = \mathbf{i}$ in Lemma 4.4.2, where both $\mathbf{i}$ and $\mathbf{j}$ have intensity $I_k$, to obtain

\[ \sum_{e \in \mathcal{E}} \mathbf{i}_e^T R_e \mathbf{i}_e = (V_u - V_v)^T = (R^{\text{eff}}_{u,v})^T, \tag{4.6} \]

where the second equality follows from the definition of effective resistance in Lemma 4.4.1. Since all the generalized edge-resistances $R_e$ are symmetric and positive-definite, we conclude that the left-hand side must be symmetric and positive-definite, which confirms that effective resistances are indeed symmetric positive-definite. This is stated below for future reference:

Proposition 4.4.1. For every pair of nodes $u, v$ in a generalized electrical network $(\mathcal{G}, R)$ that satisfies Assumption 4.2.1, the generalized effective resistance between them, $R^{\text{eff}}_{u,v}$, is a symmetric positive definite matrix. □
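
For a finite network, the generalized effective resistance can be computed by solving the generalized Kirchhoff and Ohm laws (4.3) and (4.4) directly. The sketch below is one possible way to do so, not a procedure taken from the text: it assembles a block Laplacian from the inverse resistances, fixes the potential at $v$ to zero, and solves for the remaining potentials. The function name `effective_resistance` and the numerical resistances are assumptions made for illustration.

```python
import numpy as np


def effective_resistance(n, edges, u, v):
    """Generalized effective resistance between nodes u and v of a finite network.

    edges: list of (p, q, R), R being a k x k symmetric positive definite array.
    Injects the intensity I_k at u, extracts it at v, and returns V_u - V_v.
    """
    k = edges[0][2].shape[0]
    lap = np.zeros((n * k, n * k))             # block Laplacian built from R_e^{-1}
    for p, q, R in edges:
        G = np.linalg.inv(R)
        for a, b, sign in ((p, p, 1), (q, q, 1), (p, q, -1), (q, p, -1)):
            lap[a * k:(a + 1) * k, b * k:(b + 1) * k] += sign * G
    rhs = np.zeros((n * k, k))
    rhs[u * k:(u + 1) * k] = np.eye(k)         # net outflow I_k at u
    # Potentials are unique only up to a constant, so fix V_v = 0.
    keep = [i for i in range(n * k) if not v * k <= i < (v + 1) * k]
    V = np.zeros((n * k, k))
    V[keep] = np.linalg.solve(lap[np.ix_(keep, keep)], rhs[keep])
    return V[u * k:(u + 1) * k]


R1 = np.array([[2.0, 0.5], [0.5, 1.0]])
R2 = np.array([[1.0, 0.2], [0.2, 3.0]])
R3 = np.array([[4.0, 1.0], [1.0, 2.0]])

# Two edges in series between nodes 0 and 2.
R_series = effective_resistance(3, [(0, 1, R1), (1, 2, R2)], 0, 2)
# Triangle: the series path in parallel with the direct edge (0, 2).
R_triangle = effective_resistance(3, [(0, 1, R1), (1, 2, R2), (0, 2, R3)], 0, 2)
```

In these two cases the result agrees with the matrix versions of the familiar series and parallel rules, $R_1 + R_2$ and $(R_3^{-1} + (R_1 + R_2)^{-1})^{-1}$, and is symmetric positive definite as Proposition 4.4.1 states.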


4.5  Main  results 

Given a measurement network $(\mathcal{G}, P)$, we construct an analogous generalized electrical network $(\mathcal{G}, P)$ by assigning to every edge a matrix-valued resistance equal to the measurement error covariance of that edge. The generalized effective resistances in the electrical network $(\mathcal{G}, P)$ precisely characterize the minimum estimation error covariances achievable in the measurement network $(\mathcal{G}, P)$, which is stated in the next theorem. The proof of the theorem is provided in Section 4.7.
Theorem 4.5.1 (Electrical Analogy). Consider a measurement network $(\mathcal{G}, P)$
satisfying Assumption 4.2.1 with $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a single reference node $o \in \mathcal{V}$.
Then, for every node $u \in \mathcal{V} \setminus \{o\}$, the limiting BLUE error covariance $\Sigma_{u,o}$ is
given by

$$\Sigma_{u,o} = R^{\mathrm{eff}}_{u,o},$$

where $R^{\mathrm{eff}}_{u,o}$ is the effective resistance between $u$ and $o$ in the generalized electrical
network $(\mathcal{G}, P)$. □

The proof of this result will follow directly from a more general result that also
explains what happens to the estimates constructed by using arbitrarily large,
finite subsets of the available measurements. To state the result, we need some
preliminaries. For two graphs $\mathcal{G}_1 = (\mathcal{V}_1, \mathcal{E}_1)$, $\mathcal{G}_2 = (\mathcal{V}_2, \mathcal{E}_2)$, the notation $\mathcal{G}_1 \subset \mathcal{G}_2$
means $\mathcal{V}_1 \subset \mathcal{V}_2$ and $\mathcal{E}_1 \subset \mathcal{E}_2$. We now consider a sequence of finite measurement
subgraphs $\{\mathcal{G}(n) = (\mathcal{V}(n), \mathcal{E}(n))\}$ that satisfies the following assumption.

Assumption 4.5.1 (Nested Sequence). A sequence of finite graphs $\mathcal{G}(1)$, $\mathcal{G}(2)$,
$\mathcal{G}(3)$, $\cdots$ has the following properties:

1. The sequence is nested in the sense that

$$\mathcal{G}(1) \subset \mathcal{G}(2) \subset \mathcal{G}(3) \subset \cdots \subset \mathcal{G}.$$

2. The sequence converges to the graph $\mathcal{G}$ in the sense that every node and
edge in $\mathcal{G}$ appears in $\mathcal{G}(n)$ for some finite $n$.

3. Each finite graph $\mathcal{G}(n)$, $n \in \mathbb{N}$, is weakly connected. □

When investigating the error in the estimate of $x_u$, every graph $\mathcal{G}(n)$ in such a
nested sequence of finite graphs should contain the reference node $o$ and the node
of interest $u$. Figure 4.1 shows the first few elements of such a nested graph
sequence that will eventually converge to the 2-dimensional square lattice (the
formal definition of a lattice will be provided in Section 5.5.1). One could regard
each finite subgraph $\mathcal{G}(n)$ as describing a finite subset of available measurements
that could be processed up to some time $t_n < \infty$ to construct an estimate of $x_u$.
As time increases, more measurements can be processed, and therefore at some
time $t_{n+1} > t_n$, the subgraph $\mathcal{G}(n+1)$ contains more measurements than $\mathcal{G}(n)$. In
this context, we are interested in studying whether there is a point after which there is
little improvement in estimation error upon processing more measurements, and
whether or not the sequence of estimates produced using the nested sequence of
subgraphs converges.

Figure 4.1. A nested sequence of measurement graphs, (a) $\mathcal{G}(1)$, (b) $\mathcal{G}(2)$, (c) $\mathcal{G}(3)$, that “tend to” the 2-dimensional square lattice.

Let $\hat{x}_u^{(n)}$ denote the best linear unbiased (BLU) estimate of $x_u$ in the finite
measurement network $(\mathcal{G}(n), P)$; Chapter 2 describes how to compute this estimate.
This estimate is a linear combination of the measurements $\zeta_e$, $e \in \mathcal{E}(n)$, specified by
a set of appropriately chosen coefficient matrices. In particular, the BLU estimate
is given by

$$\hat{x}_u^{(n)} = \sum_{e \in \mathcal{E}(n)} C_e^{(n)T} \zeta_e, \tag{4.7}$$

where the function $C^{(n)} : \mathcal{E}(n) \to \mathbb{R}^{k \times k}$ specifies the coefficients of the measurements.
Note that in the equation above, and in the sequel, for a function $f$ with
the edge set $\mathcal{E}$ as the domain, we use $f_e$ to denote the value of the function at
an edge $e \in \mathcal{E}$. We call the function $C^{(n)}$ the BLU estimator for $x_u$ based on the
finite graph $\mathcal{G}(n)$.

Every estimator $C^{(n)}$ can be viewed as an element of the real linear vector
space $\mathcal{H}_P$ consisting of all edge-functions of the form $C : \mathcal{E} \to \mathbb{R}^{k \times k}$ for which

$$\|C\|^2 := \sum_{e \in \mathcal{E}} \mathrm{trace}(C_e^T P_e C_e) < \infty, \tag{4.8}$$

where each $P_e$ denotes the error covariance matrix for the measurement associated
with the edge $e \in \mathcal{E}$. It is straightforward to show that $\mathcal{H}_P$ is a Hilbert space with
the associated inner product $\langle C, \bar{C} \rangle = \sum_{e \in \mathcal{E}} \mathrm{trace}(C_e^T P_e \bar{C}_e)$, $\forall C, \bar{C} \in \mathcal{H}_P$. We say
that an edge-function in $\mathcal{H}_P$ has finite support if it has only a finite number of
nonzero entries. Since all the sets $\mathcal{E}(n)$ in (4.7) are finite, every estimator $C^{(n)}$ is
a finite-support edge-function in $\mathcal{H}_P$.
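As an illustration of the norm (4.8) and the associated inner product, the following minimal sketch evaluates both for finite-support edge-functions. The dictionary representation (edge label to $k \times k$ array), the function names `inner_product` and `norm`, and the example matrices are assumptions made only for this sketch; they are not part of the development above.

```python
import numpy as np

def inner_product(C, D, P):
    """<C, D> = sum_e trace(C_e^T P_e D_e).

    C and D are finite-support edge-functions stored as {edge: k x k array};
    an edge missing from a dictionary is treated as a zero matrix.
    P maps every edge to its symmetric positive definite covariance.
    """
    return sum(np.trace(C[e].T @ P[e] @ D[e]) for e in C if e in D)

def norm(C, P):
    """Norm induced by the inner product, as in (4.8)."""
    return np.sqrt(inner_product(C, C, P))

# Hypothetical example with k = 2 and two edges labelled "a" and "b".
P = {"a": np.diag([2.0, 1.0]), "b": np.eye(2)}
C = {"a": np.eye(2)}                        # supported on edge "a" only
D = {"a": np.ones((2, 2)), "b": np.eye(2)}  # supported on both edges

print(inner_product(C, D, P), norm(C, P), norm(D, P))
```

Storing only the nonzero entries keeps the sums finite, which mirrors the observation that every estimator $C^{(n)}$ has finite support.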

For infinite graphs, the summation in (4.8) is actually a series. However, the
series is absolutely convergent due to the positive definiteness of the $P_e$'s, hence
the order of the summation is immaterial and therefore the expression in (4.8) is
well defined.

We now state the second main result of the chapter, which establishes the
convergence of BLU estimators and of the estimates, as $n \to \infty$. In the statement
of the theorem, $\hat{x}_u^{(n)}$ denotes the BLU estimate of $x_u$ in the finite graph $\mathcal{G}(n)$. The
proof is provided in Section 4.7.
Theorem 4.5.2 (BLUE Convergence). Consider a network $(\mathcal{G}, P)$ with a
single reference node $o \in \mathcal{V}$ that satisfies Assumption 4.2.1. For every node
$u \in \mathcal{V} \setminus \{o\}$, if $\{\mathcal{G}(n)\}$ is a nested sequence of finite graphs that satisfies Assumption 4.5.1
with $u$ and $o$ belonging to every graph in the sequence, the following
statements hold.

1. The sequence of BLU estimates $\{\hat{x}_u^{(n)}\}$ converges in the mean-square sense.

2. The sequence of BLU estimators $\{C^{(n)}\}$ for $x_u$ converges to some $C \in \mathcal{H}_P$.

3. The sequence of BLUE estimation error covariance matrices

$$\Sigma^{(n)}_{u,o} := E[(x_u - \hat{x}_u^{(n)})(x_u - \hat{x}_u^{(n)})^T]$$

converges to the effective resistance $R^{\mathrm{eff}}_{u,o}$ between $u$ and $o$ in the electrical
network $(\mathcal{G}, P)$, i.e.,

$$\lim_{n \to \infty} \Sigma^{(n)}_{u,o} = R^{\mathrm{eff}}_{u,o}.$$

Moreover, the BLUE covariances decrease monotonically, in the sense that

$$\Sigma^{(1)}_{u,o} \geq \Sigma^{(2)}_{u,o} \geq \Sigma^{(3)}_{u,o} \geq \cdots.$$ □

Theorem 4.5.2 shows that under the bounded degree assumption, by using only
a finite number of measurements among the infinitely many potentially available,
we can construct estimates whose error variance is arbitrarily close to the minimum
possible variance that could be achieved by using an arbitrarily large number
of measurements. In addition, the estimates themselves converge and the “limiting”
estimator is square-summable in the sense of (4.8). The theorem tells us
that even when the number of measurements goes to infinity, the limiting BLUE
covariance does not go to 0, but to a positive definite matrix.
Proofs of these results will require developing additional tools by exploiting
the analogy between generalized electrical networks and measurement networks,
which is done in Section 4.6. Before that, we briefly discuss how fast the convergence of
$\Sigma^{(n)}_{u,o}$ to $R^{\mathrm{eff}}_{u,o}$ takes place.

4.5.1  Convergence  rate 

Theorem 4.5.2 shows that the BLU estimator error variance in a sequence of
nested finite subgraphs of an infinite measurement graph converges to a limiting
variance that is numerically equal to an effective resistance, regardless of how the
sequence $\mathcal{G}(n)$ is constructed. However, the rate at which the covariances $\Sigma^{(n)}_{u,o}$
converge to the effective resistance in the infinite graph will depend on how the
sequence $\{\mathcal{G}(n)\}$ is constructed vis-a-vis the nodes $u$ and $o$. One natural way to
construct the graph $\mathcal{G}(n) = (\mathcal{V}(n), \mathcal{E}(n))$ is to take $\mathcal{V}(n)$ to contain all nodes that
are at a graphical distance smaller than $\alpha(n)$ from the shortest path connecting
$u$ and $o$, where $\alpha(\cdot)$ is a positive and increasing function. The distance of a node
from a path denotes the minimum graphical distance between the node and any
node lying on the path. If there are multiple shortest paths, we take the union
of the sets obtained for each of the shortest paths. $\mathcal{E}(n)$ is then chosen as the
set of edges that are incident on the nodes in $\mathcal{V}(n)$. This construction satisfies
Assumption 4.5.1.
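The node-set construction described above can be sketched with breadth-first search. The sketch below is only an illustration under stated assumptions: the graph is given as an undirected adjacency dictionary (edge directions are ignored when measuring graphical distance), and the names `bfs_distances`, `nested_node_set` and `grid_adjacency` are hypothetical helpers introduced for this example. A node lies on some shortest path between $u$ and $o$ exactly when its distances to $u$ and to $o$ add up to the distance between $u$ and $o$, which handles the case of multiple shortest paths.

```python
from collections import deque

def bfs_distances(adj, sources):
    """Graphical distance from the nearest source to every reachable node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        p = queue.popleft()
        for q in adj[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist

def nested_node_set(adj, u, o, radius):
    """Nodes at graphical distance < radius from the union of all shortest u-o paths."""
    du = bfs_distances(adj, [u])
    do = bfs_distances(adj, [o])
    on_shortest = [w for w in adj
                   if w in du and w in do and du[w] + do[w] == du[o]]
    dist_to_paths = bfs_distances(adj, on_shortest)
    return {w for w, d in dist_to_paths.items() if d < radius}

def grid_adjacency(m):
    """Undirected m x m grid, used here only as a small test graph."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y): [(x + dx, y + dy) for dx, dy in steps
                     if 0 <= x + dx < m and 0 <= y + dy < m]
            for x in range(m) for y in range(m)}

adj = grid_adjacency(5)
V1 = nested_node_set(adj, (0, 0), (2, 2), radius=1)  # the shortest paths themselves
V2 = nested_node_set(adj, (0, 0), (2, 2), radius=2)  # one more layer of neighbors
print(len(V1), len(V2), V1 <= V2)
```

Because the radius is increasing in $n$, the resulting node sets are nested, as required by Assumption 4.5.1.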

Figure 4.2(a-c) shows the first three members of a sequence of nested subgraphs
$\{\mathbf{Z}_2(n)\}$ of the 2-dimensional lattice $\mathbf{Z}_2$, constructed according to the procedure
outlined above, with $\alpha(n) = n$. For simplicity, we consider the case of scalar
variables and measurements, and every measurement error is assumed to have
variance 1. Covariances for vector-valued variables ($k > 1$) could be obtained
using Lemma 4.6.1. Figure 4.2(d) shows the plot of the variances $\Sigma^{(n)}_{u,o}$ of node
$u$ in the measurement network $(\mathbf{Z}_2(n), 1)$ as a function of $n$. The limiting value
of the variance is the effective resistance between $u$ and $o$ in the infinite lattice
$\mathbf{Z}_2$. In an infinite 2-dimensional lattice with unit resistance on every edge, the
effective resistance between two nodes with relative coordinates $x$ and $y$ is
given, to a good approximation when the nodes are not too close, by

$$R^{\mathrm{eff}}_{u,o} = \frac{1}{\pi}\left(\log\sqrt{x^2 + y^2} + \gamma + \frac{1}{2}\log 8\right),$$

where $\gamma \approx 0.577$ [105]. For the
example in Figure 4.2(a-c), $x = 4$, $y = 0$, so the limiting variance for node $u$ is
$\Sigma_{u,o} = R^{\mathrm{eff}}_{u,o} \approx 0.956$, which is shown by a dotted line in Figure 4.2(d). As
expected, the variances $\Sigma^{(n)}_{u,o}$ monotonically decrease and approach the asymptotic
value as $n$ increases.

Figure 4.2. (a)-(c) The first three members, $\mathbf{Z}_2(1)$, $\mathbf{Z}_2(2)$, $\mathbf{Z}_2(3)$, of a sequence of nested subgraphs $\mathbf{Z}_2(n)$ of the 2-dimensional lattice $\mathbf{Z}_2$. (d) The variances $\Sigma^{(n)}_{u,o}$ in the sequence of measurement networks $(\mathbf{Z}_2(n), 1)$ as a function of $n$ (shown in circles), and the value of the effective resistance between $u$ and $o$ in the infinite lattice $\mathbf{Z}_2$ (shown as a dotted line). (e) Trend of the ratio $\|\Sigma^{(n)}_{u,o}\| / \|\Sigma_{u,o}\|$ of the variance in the finite subnetworks $(\mathbf{Z}_2(n), 1)$ to the minimum possible variance in $(\mathbf{Z}_2, 1)$, as a function of $\beta(n)$, for three different node pairs $u$, $o$.
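The trend described above can be reproduced numerically with a short script. The following is a rough sketch rather than the procedure used to generate Figure 4.2: it assumes that $o = (0,0)$ and $u = (4,0)$, takes the nodes at graphical (Manhattan) distance smaller than $n$ from the straight segment joining them, keeps only the lattice edges with both endpoints in that node set (a simplifying assumption), and obtains the scalar effective resistance from the inverse of the Dirichlet Laplacian. The helper names `strip_nodes` and `effective_resistance` are hypothetical.

```python
import numpy as np

def strip_nodes(d, radius):
    """Lattice nodes at graphical distance < radius from the segment (0,0)-(d,0)."""
    nodes = []
    for x in range(-radius, d + radius + 1):
        for y in range(-radius, radius + 1):
            dx = max(0, -x, x - d)          # horizontal distance to the segment
            if dx + abs(y) < radius:        # Manhattan distance to the segment
                nodes.append((x, y))
    return nodes

def effective_resistance(nodes, u, o):
    """Scalar effective resistance between u and o with unit resistors on the
    lattice edges induced by `nodes` (reference node o is grounded)."""
    index = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    L = np.zeros((n, n))
    for (x, y), i in index.items():
        for q in ((x + 1, y), (x, y + 1)):  # visit each undirected edge once
            j = index.get(q)
            if j is not None:
                L[i, i] += 1.0
                L[j, j] += 1.0
                L[i, j] -= 1.0
                L[j, i] -= 1.0
    keep = [i for p, i in index.items() if p != o]
    Ld = L[np.ix_(keep, keep)]              # Dirichlet Laplacian w.r.t. {o}
    e = np.zeros(len(keep))
    e[keep.index(index[u])] = 1.0
    return float(e @ np.linalg.solve(Ld, e))

d = 4
variances = [effective_resistance(strip_nodes(d, n), (d, 0), (0, 0))
             for n in range(1, 10)]
gamma = 0.5772156649
limit = (np.log(d) + gamma + 0.5 * np.log(8)) / np.pi   # formula quoted above
print(variances, limit)
```

For $n = 1$ the subgraph is just the path of four unit resistors in series, so the first variance is 4; the later values decrease toward the limiting value, in line with Rayleigh's monotonicity law.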

For a given nested sequence $\mathcal{G}(n)$, the convergence rate of $\Sigma^{(n)}_{u,o}$ to $\Sigma_{u,o}$ will depend
on the graphical distance $d_{u,o}$ between nodes $u$ and $o$. Taking this into account, we
can construct the sequence $\mathcal{G}(n)$ by choosing $\mathcal{V}(n)$ as the set of nodes that are within
a graphical distance of $\beta(n) d_{u,o}$ of the shortest path connecting $u$ and $o$, where
$\beta(\cdot)$ is a positive and increasing function. Numerical studies on the 2-dimensional
lattice $\mathbf{Z}_2$ indicate that with this construction, the ratio $\|\Sigma^{(n)}_{u,o}\| / \|\Sigma_{u,o}\|$ depends
only on the value of $\beta$ and is independent of $d_{u,o}$. Figure 4.2(e) shows the ratio
$\|\Sigma^{(n)}_{u,o}\| / \|\Sigma_{u,o}\|$ as a function of $\beta$ for three different nodes taken at distances of 2,
4 and 8, respectively, from $o$. The figure shows that the rate of convergence of
$\Sigma^{(n)}_{u,o}$ to the limiting value $\Sigma_{u,o}$ is not sensitive to the distance between $u$ and $o$. In
particular, with $\beta = 2$, the error between $\Sigma^{(n)}_{u,o}$ and $\Sigma_{u,o}$ is less than 10%.

These  studies  show  that  in  a  2-dimensional  lattice,  a  relatively  small  subgraph 
is  sufficient  to  obtain  an  estimate  whose  variance  is  quite  close  to  the  minimum 
possible  achievable  by  using  all  the  measurements.  For  an  arbitrary  measurement 
graph,  as  long  as  the  graph  is  “close  to”  a  lattice  in  an  appropriate  sense,  similar 
trends  are  expected.  Appropriate  measures  of  closeness  to  lattices  are  developed 
in  Chapter  5. 
4.6  Properties  of  generalized  electrical  networks


4.6.1  Rayleigh’s  monotonicity  law 

The next result relates the effective resistances of two distinct networks related
by an appropriate partial order. A similar result for finite scalar networks,
called Rayleigh’s Monotonicity Law [2], states that if the edge-resistances in a
scalar electrical network are increased (perhaps even made infinite, i.e., an open
circuit), then the effective resistance between every pair of nodes in the network
can only increase. For a long time, Rayleigh’s Monotonicity Law was considered
so evidently true that no proof was deemed necessary. Nevertheless, Doyle and
Snell [2] provided a proof, which we now extend to generalized electrical networks.

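To present the result in full generality, we introduce the notion of graph embedding.
We say that a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ can be embedded in another graph
$\bar{\mathcal{G}} = (\bar{\mathcal{V}}, \bar{\mathcal{E}})$ if $\mathcal{V} \subset \bar{\mathcal{V}}$, and, for every edge between two nodes in $\mathcal{G}$, there is a
corresponding edge between them in $\bar{\mathcal{G}}$. More precisely, $\mathcal{G}$ can be embedded in $\bar{\mathcal{G}}$
if

1. there exists an injective map $\eta : \mathcal{V} \to \bar{\mathcal{V}}$, and

2. for every $e \in \mathcal{E}$, there exists $\bar{e} \in \bar{\mathcal{E}}$ such that, if $e \sim u, v$ then $\bar{e} \sim \eta(u), \eta(v)$.

In other words, $\mathcal{G}$ is a subgraph of $\bar{\mathcal{G}}$ when they are thought of as undirected. In
the sequel, we use $\mathcal{G} \subset \bar{\mathcal{G}}$ to denote that $\mathcal{G}$ can be embedded in $\bar{\mathcal{G}}$. In addition,
when $\eta : \mathcal{V} \to \bar{\mathcal{V}}$ is the embedding of $\mathcal{G}$ into $\bar{\mathcal{G}}$, for every edge $e \in \mathcal{E}$, we use the
somewhat loose notation $\eta(e)$ to denote the edge in $\bar{\mathcal{E}}$ that corresponds to the
edge $e$.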
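Theorem 4.6.1 (Generalized Rayleigh’s Monotonicity Law). Consider two
generalized electrical networks $(\mathcal{G}, R)$ and $(\bar{\mathcal{G}}, \bar{R})$ for which $\mathcal{G}$ can be embedded in
$\bar{\mathcal{G}}$ with an embedding function $\eta : \mathcal{V} \to \bar{\mathcal{V}}$, i.e., $\mathcal{G} \subset \bar{\mathcal{G}}$, and $R_e \geq \bar{R}_{\eta(e)}$ for every
$e \in \mathcal{E}$. For every pair of nodes $u$, $v$ of $\mathcal{G}$,

$$R^{\mathrm{eff}}_{u,v} \geq \bar{R}^{\mathrm{eff}}_{u,v},$$

where $R^{\mathrm{eff}}_{u,v}$ and $\bar{R}^{\mathrm{eff}}_{u,v}$ are the effective resistances between $u$ and $v$ in the networks
$(\mathcal{G}, R)$ and $(\bar{\mathcal{G}}, \bar{R})$, respectively. □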

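Proof of Theorem 4.6.1. Let $i : \mathcal{E} \to \mathbb{R}^{k \times k}$ and $\bar{i} : \bar{\mathcal{E}} \to \mathbb{R}^{k \times k}$ be the currents
from $u$ to $v$ in the networks $(\mathcal{G}, R)$ and $(\bar{\mathcal{G}}, \bar{R})$, respectively, both with intensity
$\mathbf{i} \in \mathbb{R}^{k \times k}$. Denote by $\eta(\mathcal{E})$ the set of edges in $\bar{\mathcal{E}}$ that correspond to the edges in
$\mathcal{E}$. Define $\bar{j} : \bar{\mathcal{E}} \to \mathbb{R}^{k \times k}$ to be the following “extension” of the current $i$ to the
graph $\bar{\mathcal{G}}$:

$$\bar{j}_{\bar{e}} = \begin{cases} i_{\eta^{-1}(\bar{e})} & \bar{e} \in \eta(\mathcal{E}), \\ 0 & \bar{e} \in \bar{\mathcal{E}} \setminus \eta(\mathcal{E}), \end{cases}$$

where $\eta^{-1}(\bar{e})$ represents the pre-image of $\bar{e}$ in the set $\mathcal{E}$, thinking of $\eta$ as a mapping
from $\mathcal{E}$ to $\bar{\mathcal{E}}$. We conclude that $\bar{j}$ satisfies Kirchhoff’s current law (4.3) for the
network $(\bar{\mathcal{G}}, \bar{R})$ and is therefore a flow for this network (although not necessarily a
current). Since according to Theorem 4.4.1 the current $\bar{i}$ is the flow of minimum
dissipated energy for the network $(\bar{\mathcal{G}}, \bar{R})$, we conclude that

$$\begin{aligned}
\mathrm{trace}\Big(\sum_{\bar{e} \in \bar{\mathcal{E}}} \bar{i}_{\bar{e}}^T \bar{R}_{\bar{e}} \bar{i}_{\bar{e}}\Big) &\leq \mathrm{trace}\Big(\sum_{\bar{e} \in \bar{\mathcal{E}}} \bar{j}_{\bar{e}}^T \bar{R}_{\bar{e}} \bar{j}_{\bar{e}}\Big) \\
&\leq \mathrm{trace}\Big(\sum_{\bar{e} \in \eta(\mathcal{E})} \bar{j}_{\bar{e}}^T \bar{R}_{\bar{e}} \bar{j}_{\bar{e}}\Big) \\
&= \mathrm{trace}\Big(\sum_{e \in \mathcal{E}} i_e^T \bar{R}_{\eta(e)} i_e\Big) \\
&\leq \mathrm{trace}\Big(\sum_{e \in \mathcal{E}} i_e^T R_e i_e\Big),
\end{aligned}$$

where the first inequality holds because $\bar{i}$ minimizes the dissipated energy, the second
is due to $\eta(\mathcal{E}) \subset \bar{\mathcal{E}}$ (it in fact holds with equality, since $\bar{j}$ vanishes on $\bar{\mathcal{E}} \setminus \eta(\mathcal{E})$),
the equality is a consequence of the definition of $\bar{j}$, and the last inequality
follows from the fact that $\bar{R}_{\eta(e)} \leq R_e$, $\forall e \in \mathcal{E}$. From this, Lemma 4.4.2, and the
definition of effective resistance, we conclude that

$$\mathrm{trace}(\mathbf{i}^T \bar{R}^{\mathrm{eff}}_{u,v} \mathbf{i}) \leq \mathrm{trace}(\mathbf{i}^T R^{\mathrm{eff}}_{u,v} \mathbf{i})$$

for every $\mathbf{i} \in \mathbb{R}^{k \times k}$, from which the result follows. ■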
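A quick numerical sanity check of the monotonicity law in the scalar special case $k = 1$ may help fix ideas. The sketch below is an illustration only: the four-node graph, its resistance values, and the helper name `effective_resistance` are assumptions chosen for the example, and the effective resistance is computed from the pseudo-inverse of the weighted graph Laplacian rather than from currents as in the proof. The "sub" network plays the role of $(\mathcal{G}, R)$ and the "super" network the role of $(\bar{\mathcal{G}}, \bar{R})$: it has one extra edge and no larger resistance on any shared edge.

```python
import numpy as np

def effective_resistance(n, edges, u, v):
    """Scalar effective resistance between nodes u and v.

    edges: {(p, q): resistance} on nodes 0..n-1 (edge directions are irrelevant).
    Uses R_uv = (e_u - e_v)^T L^+ (e_u - e_v), with L the conductance-weighted
    Laplacian of a connected graph and L^+ its pseudo-inverse.
    """
    L = np.zeros((n, n))
    for (p, q), r in edges.items():
        w = 1.0 / r
        L[p, p] += w
        L[q, q] += w
        L[p, q] -= w
        L[q, p] -= w
    x = np.zeros(n)
    x[u], x[v] = 1.0, -1.0
    return float(x @ np.linalg.pinv(L) @ x)

# (G-bar, R-bar): a 4-cycle with a chord, unit resistances.
super_edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0, (0, 2): 1.0}
# (G, R): chord removed (open circuit) and one resistance doubled.
sub_edges = {(0, 1): 2.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}

for u in range(4):
    for v in range(u + 1, 4):
        print(u, v,
              effective_resistance(4, sub_edges, u, v),
              effective_resistance(4, super_edges, u, v))
```

For the pair (0, 2), the series-parallel rules give 0.5 in the larger network and 1.2 in the smaller one, consistent with the inequality of Theorem 4.6.1.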

Remark 4.6.1 (Role of edge directions). Effective resistances are independent of
the directions of the edges in the graph. Reversing the direction of an edge $e$
simply reverses the sign of the current $i_e$ on that edge. It follows from (4.6) that
the effective resistance between any two nodes is unaffected by the edge-directions.
Rayleigh’s monotonicity offers further evidence of this fact. Therefore, from now
on we use $\mathcal{G} \subset \bar{\mathcal{G}}$ to denote that $\mathcal{G}$ can be embedded in $\bar{\mathcal{G}}$. Note that the graph
partial order defined in Assumption 4.5.1 can now be understood to mean graph
embedding; the results of Theorem 4.5.2 do not change in doing so.

It follows from the electrical analogy Theorem 4.5.1 that, although a measurement
graph is directed because of the need to distinguish between a measurement
of $x_u - x_v$ and that of $x_v - x_u$, the BLUE covariances are independent of edge
directions. □
4.6.2  Other  properties


The next result shows that if the scalar effective resistances in an electrical
network with 1-Ohm resistors are known, then the generalized effective resistances
obtained upon putting the same matrix resistance on every edge can be deduced from them.
Its proof is provided in Section 4.9.

Lemma 4.6.1. For a graph $\mathcal{G}$ with finite maximum node degree, let $r^{\mathrm{eff}}_{u,v}$ denote
the scalar effective resistance between two nodes $u$ and $v$ in a scalar electrical
network $(\mathcal{G}, 1)$ that has 1-Ohm resistors on every edge of the graph $\mathcal{G}$. Let $(\mathcal{G}, R_o)$
be a generalized electrical network constructed from the same graph $\mathcal{G}$ by assigning
a generalized resistance $R_o \in \mathbb{S}^{k+}$ to every edge of $\mathcal{G}$. Then,

$$R^{\mathrm{eff}}_{u,v} = r^{\mathrm{eff}}_{u,v} R_o.$$ □


It turns out that matrix-valued effective resistances obey the triangle inequality.
It is known that scalar effective resistance obeys the triangle inequality, and is
therefore also referred to as the “resistance distance” [106]. Although the result
in [106] was proved only for finite networks, it is not hard to extend it to infinite
networks as well. Application of Lemma 4.6.1 then leads to the following
simple extension of the triangle inequality to generalized networks with constant
resistances on every edge.


Lemma 4.6.2 (Triangle Inequality). Let $(\mathcal{G}, R_o)$ be a generalized electrical
network satisfying Assumption 4.2.1 with a constant resistance $R_o \in \mathbb{S}^{k+}$ on every
edge of $\mathcal{G}$. Then, for every triple of nodes $u$, $v$, $w$ in the network,

$$R^{\mathrm{eff}}_{u,w} \leq R^{\mathrm{eff}}_{u,v} + R^{\mathrm{eff}}_{v,w}.$$ □


The next result states that we can replace parallel edges by a single edge of appropriate
resistance so that the effective resistances in the network are unchanged.
Recall that two edges $e_1$ and $e_2$ are said to be parallel if they are incident on the
same set of nodes (irrespective of the direction of the edges). A similar result was
stated without proof for finite graphs in Section 2.2.3. The proof of the proposition
is provided in Section 4.9.

We assume that the edge set $\mathcal{E}$ is specified as $\mathcal{E} = \{\mathcal{E}_1, \mathcal{E}_2, \cdots\}$, such that if
$e \in \mathcal{E}_j$ for some $j$, then all the edges parallel to $e$ also belong to $\mathcal{E}_j$.

Proposition 4.6.1 (Parallel Resistors). Consider a generalized electrical network
$(\mathcal{G}, R)$ satisfying Assumption 4.2.1. Let $\mathcal{E}_j \subset \mathcal{E}$, $j = 1, 2, \ldots$, be subsets of
parallel edges as described above. The effective resistance between every pair of
nodes remains the same if every set $\mathcal{E}_j$ is replaced by a single edge $e_j$ with edge
resistance $R_{e_j}$ that is specified by

$$R_{e_j} = \Big(\sum_{e \in \mathcal{E}_j} R_e^{-1}\Big)^{-1}.$$ □


The reader may notice that the above rule is simply the application of the
parallel resistance formula to generalized resistances.
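The parallel resistance rule for matrix-valued resistances amounts to adding the inverses (the generalized conductances) and inverting the sum. A minimal sketch follows; the function name `parallel` and the example matrices are hypothetical and serve only to illustrate the rule.

```python
import numpy as np

def parallel(resistances):
    """Equivalent resistance of parallel edges with symmetric positive definite
    matrix resistances: the inverse of the sum of the inverses."""
    return np.linalg.inv(sum(np.linalg.inv(R) for R in resistances))

# Hypothetical 2 x 2 generalized resistances on two parallel edges.
R1 = np.array([[2.0, 0.5], [0.5, 1.0]])
R2 = np.array([[1.0, 0.0], [0.0, 3.0]])
Rp = parallel([R1, R2])

print(Rp)
print(np.linalg.eigvalsh(R1 - Rp))   # positive: Rp is smaller than R1
```

In the scalar case the rule reduces to the familiar formula, for instance 2 and 3 Ohm in parallel give 1.2 Ohm, and the equivalent resistance is smaller, in the positive definite order, than each of the resistances it replaces.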

4.6.3  Approximating  infinite  network  currents 

The next theorem shows that currents and effective resistances in an infinite
network can be approximated with arbitrary accuracy by those in a sufficiently
large but finite subnetwork. A similar result for the usual scalar electrical networks
was established by Flanders [103, 104]. The proof of the theorem, which is inspired
by [103], is provided in Section 4.9.

Theorem 4.6.2 (Finite Approximation). Let $(\mathcal{G}, R)$ be a network satisfying
Assumption 4.2.1, $\{\mathcal{G}(n)\}$ a nested sequence of finite graphs satisfying Assumption 4.5.1,
and $u$, $v$ two arbitrary nodes that appear in every graph $\mathcal{G}(n)$. For an
arbitrary $\mathbf{i} \in \mathbb{R}^{k \times k}$, let $i$ and $i^{(n)}$ denote the currents from node $u$ to node $v$ in
the infinite network $(\mathcal{G}, R)$ and in the finite network $(\mathcal{G}(n), R)$, respectively, with
intensity $\mathbf{i}$. Then,

$$\lim_{n \to \infty} i^{(n)} = i,$$

where convergence is in the $\mathcal{H}_R$-norm. In addition, denoting by $R^{\mathrm{eff}}_{u,v}$ and $R^{\mathrm{eff}(n)}_{u,v}$ the
effective resistances between nodes $u$ and $v$ in the networks $(\mathcal{G}, R)$ and $(\mathcal{G}(n), R)$,
respectively, we have

$$\lim_{n \to \infty} R^{\mathrm{eff}(n)}_{u,v} = R^{\mathrm{eff}}_{u,v}.$$ □


This result will be instrumental in showing that the BLU estimator error
covariances in large finite networks converge to the effective resistance in the
limiting infinite network.


4.6.4  Electrical  analogy  for  finite  networks 

In  a  finite  measurement  network,  the  BLUE  covariance  of  a  node  variable 
xu  is  the  same  as  the  generalized  effective  resistance  between  u  and  o  in  the 
corresponding  electrical  network,  which  is  stated  in  the  next  theorem. 

Theorem 4.6.3 (Finite Electrical Analogy). Let $(\mathcal{G}, P)$ be a measurement
network with a finite weakly connected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and an edge-covariance
function $P : \mathcal{E} \to \mathbb{S}^{k+}$, with node $o$ as the reference node. For every node $u \in
\mathcal{V} \setminus \{o\}$, the following statements hold.

1. The BLU estimator $C$ of $x_u$ in the finite measurement network $(\mathcal{G}, P)$ is
equal to the current $i$ with identity intensity $I_k$ in the generalized electrical
network $(\mathcal{G}, P)$ from $u$ to $o$:

$$C = i.$$

2. The covariance $\Sigma_{u,o}$ of the BLU estimation error $x_u - \hat{x}_u$ is equal to the
effective resistance $R^{\mathrm{eff}}_{u,o}$ between the node $u$ and the reference node $o$:

$$\Sigma_{u,o} = R^{\mathrm{eff}}_{u,o}.$$ □


It was shown in Section 2.2.1 that the BLU covariance of the vector of all node
variables is given by the inverse of the Dirichlet Laplacian $L$ (see Theorem 2.2.1).
It follows from the result above that the effective resistances in a finite network
are the $k \times k$ blocks on the diagonal of $L^{-1}$. This is stated formally in the next
corollary.

Corollary 4.6.1. Consider a finite generalized electrical network $(\mathcal{G}, R)$ satisfying
Assumption 4.2.1 where $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ consists of $n$ nodes, of which one is a
reference node. Let the reference node be numbered as the $n$th node and the others
be numbered as $1, \ldots, n-1$. Then, the effective resistance between the node with
index $u$ and the reference node $n$ is given by

$$R^{\mathrm{eff}}_{u,n} = (\phi_u^T \otimes I_k) L^{-1} (\phi_u \otimes I_k),$$

where $\phi_u \in \mathbb{R}^{n-1}$ has all zeros except a 1 at the $u$th location, and $L$ is the Dirichlet
Laplacian of $\mathcal{G}$ w.r.t. the boundary $\{n\}$ and edge-weights $R_e^{-1}$. □
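The formula of Corollary 4.6.1 translates directly into a few lines of linear algebra. The sketch below is an illustration under stated assumptions: nodes are indexed from 0, the reference node is taken to be the last index, the Dirichlet Laplacian is assembled block by block from the edge weights $R_e^{-1}$, and the function name `generalized_effective_resistance` and the example matrices are hypothetical.

```python
import numpy as np

def generalized_effective_resistance(n, edges, u, k):
    """Effective resistance between node u and the reference node n-1.

    edges: list of (p, q, R) with R a k x k symmetric positive definite
    resistance; nodes are 0..n-1 and edge directions are irrelevant.
    """
    m = n - 1                                   # number of non-reference nodes
    L = np.zeros((m * k, m * k))                # Dirichlet Laplacian
    for p, q, R in edges:
        W = np.linalg.inv(R)                    # edge weight R_e^{-1}
        for a, b, sign in ((p, p, 1), (q, q, 1), (p, q, -1), (q, p, -1)):
            if a < m and b < m:                 # drop rows/columns of the reference
                L[a * k:(a + 1) * k, b * k:(b + 1) * k] += sign * W
    Phi = np.kron(np.eye(m)[:, [u]], np.eye(k))  # phi_u (Kronecker) I_k
    return Phi.T @ np.linalg.solve(L, Phi)

R1 = np.array([[2.0, 0.5], [0.5, 1.0]])
R2 = np.array([[1.0, 0.0], [0.0, 3.0]])

# Two matrix resistors in series between node 0 and the reference node 2.
series = generalized_effective_resistance(3, [(0, 1, R1), (1, 2, R2)], u=0, k=2)
# A triangle with the same resistance R1 on every edge.
triangle = generalized_effective_resistance(
    3, [(0, 1, R1), (1, 2, R1), (0, 2, R1)], u=0, k=2)
print(series)
print(triangle)
```

The series example returns $R_1 + R_2$, and the triangle example returns $\tfrac{2}{3} R_1$, which agrees with Lemma 4.6.1 since the scalar effective resistance across one edge of a unit-resistance triangle is $2/3$.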


Proof of Theorem 4.6.3. From Proposition 2.2.1 on the characterization of optimal
estimators of node variables in finite graphs and the definition of energy
dissipation (4.5), we see that in a finite network $(\mathcal{G}, P)$ with reference node $o$, the
BLU estimator $C$ of node variable $x_u$ is given by

$$C = \arg\min \|j\| \quad \text{subject to: } j \text{ is a flow of intensity } I_k \text{ from } u \text{ to } o.$$

Comparing with the electrical network problem, we conclude from Theorem 4.4.1
that the BLU estimator $C$ of $x_u$ is the current $i$ of intensity $I_k$ from $u$ to $o$ in the
generalized electrical network $(\mathcal{G}, P)$, which proves the first statement.

Since $C = i$, it follows from Unbiased Estimator Lemma 2.2.2 that the covariance
of $x_u$'s BLU estimation error is given by

$$\Sigma_{u,o} = \sum_{e \in \mathcal{E}} i_e^T P_e i_e = R^{\mathrm{eff}}_{u,o},$$

where the second equality follows from (4.6), which proves the second statement. ■
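The two statements of Theorem 4.6.3 can be checked numerically in the scalar case $k = 1$. The sketch below is a hedged illustration, not the construction of Chapter 2: it assumes measurements of the form $\zeta_e = x_p - x_q + \epsilon_e$ with the reference variable set to zero, computes the BLU estimator by weighted least squares, and compares its coefficients with the current obtained from Ohm's law when a unit current is injected at $u$ and extracted at the reference node. The function name `blue_and_current` and the example network are hypothetical.

```python
import numpy as np

def blue_and_current(n, edges, u):
    """Scalar case (k = 1); the reference node is n-1 and x_ref = 0.

    edges: list of (p, q, var), a measurement of x_p - x_q with error variance var.
    Returns the BLU estimator coefficients for x_u, the unit current from u to
    the reference node, the reduced incidence matrix, and the effective resistance.
    """
    m = n - 1
    A = np.zeros((m, len(edges)))               # reduced incidence matrix
    for j, (p, q, _) in enumerate(edges):
        if p < m:
            A[p, j] = 1.0                       # edge leaves p
        if q < m:
            A[q, j] = -1.0                      # edge is directed towards q
    W = np.diag([1.0 / var for _, _, var in edges])
    L = A @ W @ A.T                             # Dirichlet Laplacian
    gain = np.linalg.solve(L, A @ W)            # BLU estimate = gain @ zeta
    C = gain[u]                                 # coefficients of the estimate of x_u
    V = np.linalg.solve(L, np.eye(m)[u])        # potentials, unit current u -> ref
    current = W @ (A.T @ V)                     # Ohm's law on every edge
    return C, current, A, V[u]                  # V[u] is the effective resistance

edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 0.5), (2, 3, 1.0), (1, 3, 4.0)]
C, current, A, R_eff = blue_and_current(4, edges, u=0)
variances = np.array([var for _, _, var in edges])
print(C, current, np.sum(variances * C ** 2), R_eff)
```

The estimator coefficients coincide with the edge currents, they satisfy Kirchhoff's current law with unit intensity at $u$, and the error variance $\sum_e P_e C_e^2$ equals the effective resistance, as the theorem asserts.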


4.7  Proof  of  the  main  results 

First we prove the BLUE convergence Theorem 4.5.2, using the tools developed
so far.

Proof of Theorem 4.5.2. We will prove the statements of the theorem in reverse
order.

Since the sequence of BLU covariances $\Sigma^{(n)}_{u,o}$ is the same as the sequence of effective
resistances $R^{\mathrm{eff}(n)}_{u,o}$ (Finite Electrical Analogy Theorem 4.6.3), and the sequence
$R^{\mathrm{eff}(n)}_{u,o}$ converges to the effective resistance $R^{\mathrm{eff}}_{u,o}$ in the infinite network (Finite
Approximation Theorem 4.6.2), we have

$$\Sigma^{(n)}_{u,o} = R^{\mathrm{eff}(n)}_{u,o} \to R^{\mathrm{eff}}_{u,o}.$$

Moreover, by the construction of the nested sequence $\{\mathcal{G}(n)\}$, if $n_1 \leq n_2$, then
$\mathcal{G}(n_1) \subset \mathcal{G}(n_2)$, and so by Rayleigh’s monotonicity law (Theorem 4.6.1),

$$\Sigma^{(1)}_{u,o} \geq \Sigma^{(2)}_{u,o} \geq \cdots,$$

from which the third statement of the theorem follows.

The BLU estimator $C^{(n)}$ of $x_u$ in the finite network $(\mathcal{G}(n), P)$ is equal to the current
$i^{(n)}$ in the generalized electrical network $(\mathcal{G}(n), P)$ (Finite Electrical Analogy Theorem 4.6.3),
and the currents $i^{(n)}$ converge to the unique current $i$ in the electrical
network $(\mathcal{G}, P)$ (Finite Approximation Theorem 4.6.2). Therefore

$$C^{(n)} = i^{(n)} \to i,$$

where the convergence is in the $\mathcal{H}_P$-norm. This proves the second statement.


By definition of the BLU estimator, we get

$$\hat{x}_u^{(n)} = \sum_{(p,q) \in \mathcal{E}(n)} C^{(n)T}_{p,q} (x_p - x_q + \epsilon_{p,q}) = x_u + \sum_{(p,q) \in \mathcal{E}(n)} C^{(n)T}_{p,q} \epsilon_{p,q}, \tag{4.9}$$

where the second equality follows from unbiasedness, since otherwise the expectation
of the left hand side would not be equal to $x_u$. Let $n < l$, so that from Nested
Sequence Assumption 4.5.1, $\mathcal{G}(n) \subset \mathcal{G}(l)$. It follows from the uncorrelatedness of
the $\epsilon$'s and (4.9) that

$$E[(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})^T] = \sum_{e \in \mathcal{E}(l)} (C_e^{(l)} - C_e^{(n)})^T P_e (C_e^{(l)} - C_e^{(n)}),$$

where we have used the convention that $C_e^{(n)} = 0$ if $e \in \mathcal{E}(l) \setminus \mathcal{E}(n)$. This leads to

$$\mathrm{trace}\big(E[(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})^T]\big) = \|C^{(l)} - C^{(n)}\|^2,$$

where $\|\cdot\|$ is the $\mathcal{H}_P$-norm. Since $C^{(n)} \to i$, $\|C^{(l)} - C^{(n)}\| \to 0$ as $n, l \to \infty$.
Therefore,

$$\lim_{n \to \infty} \sup_{l > n} \mathrm{trace}\big(E[(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})(\hat{x}_u^{(l)} - \hat{x}_u^{(n)})^T]\big) = 0. \tag{4.10}$$

We recall that a sequence of random variables $\{\eta_n\}$ converges in the mean square
sense if and only if (Proposition 6.3 in [107])

$$\limsup_{n, l \to \infty} E[|\eta_l - \eta_n|^2] = 0.$$

Therefore, the sequence of random vectors $\hat{x}_u^{(n)}$ converges entry-wise in the mean
square sense. This proves the first statement and completes the proof. ■

It  was  shown  in  the  previous  section  that  for  finite  measurement  networks,  the 
effective  resistance  is  the  same  as  the  BLUE  covariance.  The  Electrical  Analogy 
Theorem  4.5.1  is  an  extension  of  this  analogy  to  infinite  networks,  which  follows 
as  a  consequence  of  Theorem  4.5.2. 

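Proof of Theorem 4.5.1. When the graph $\mathcal{G}$ is finite, the statement of the theorem
follows from Finite Electrical Analogy Theorem 4.6.3.

When $\mathcal{G}$ is infinite, consider a sequence $\{\mathcal{G}(n)\}$ of nested finite subgraphs of the
infinite graph $\mathcal{G}$ that satisfies the Nested Sequence Assumption 4.5.1. We know
from the BLUE convergence Theorem 4.5.2 that the sequence of BLU estimation
error covariance matrices $\Sigma^{(n)}_{u,o}$ converges monotonically to the effective resistance
$R^{\mathrm{eff}}_{u,o}$, i.e.,

$$\Sigma^{(1)}_{u,o} \geq \Sigma^{(2)}_{u,o} \geq \cdots, \quad \text{and} \quad \lim_{n \to \infty} \Sigma^{(n)}_{u,o} = R^{\mathrm{eff}}_{u,o}.$$

It is straightforward to show that $R^{\mathrm{eff}}_{u,o} = \inf\{\Sigma^{(n)}_{u,o}, n \in \mathbb{N}\}$ according to the matrix
infimum definition (4.2). That is, $R^{\mathrm{eff}}_{u,o} \leq \Sigma^{(n)}_{u,o}$ $\forall n \in \mathbb{N}$, and for every $\epsilon > 0$, $\exists n$
such that $R^{\mathrm{eff}}_{u,o} + \epsilon I_k > \Sigma^{(n)}_{u,o}$. Now we will show that $R^{\mathrm{eff}}_{u,o}$ is also the infimum of
the set

$$S := \{\Sigma_{u,o}(\mathcal{G}_{\mathrm{finite}}) : \mathcal{G}_{\mathrm{finite}} \subset \mathcal{G}, \ \mathcal{G}_{\mathrm{finite}} \text{ is finite}\}.$$

If $\mathcal{G}_{\mathrm{finite}}$ is an arbitrary finite subgraph of the infinite graph $\mathcal{G}$, then $\exists n \in \mathbb{N}$ such
that $\mathcal{G}_{\mathrm{finite}} \subset \mathcal{G}(n)$. By the Finite Electrical Analogy Theorem 4.6.3 and Rayleigh’s
monotonicity law Theorem 4.6.1, we have $\Sigma^{(n)}_{u,o} \leq \Sigma_{u,o}(\mathcal{G}_{\mathrm{finite}})$. Since $R^{\mathrm{eff}}_{u,o} \leq \Sigma^{(n)}_{u,o}$,
we have that $R^{\mathrm{eff}}_{u,o}$ is a lower bound of the set $S$ defined above.

To show that $R^{\mathrm{eff}}_{u,o}$ is the largest lower bound, pick an $\epsilon > 0$ and pick $m \in \mathbb{N}$
such that $R^{\mathrm{eff}}_{u,o} + \epsilon I_k > \Sigma^{(m)}_{u,o}$. Such an $m$ exists since $R^{\mathrm{eff}}_{u,o}$ is the infimum of
$\{\Sigma^{(n)}_{u,o}\}$. Now pick any finite subgraph $\mathcal{G}_{\mathrm{finite}}$ of $\mathcal{G}$ such that $\mathcal{G}_{\mathrm{finite}} \supset \mathcal{G}(m)$. From
the electrical analogy for finite networks and Rayleigh’s Monotonicity Law, we
have $\Sigma^{(m)}_{u,o} \geq \Sigma_{u,o}(\mathcal{G}_{\mathrm{finite}})$. Thus, for every $\epsilon > 0$, we can find a finite subgraph
$\mathcal{G}_{\mathrm{finite}}$ of $\mathcal{G}$ such that $R^{\mathrm{eff}}_{u,o} + \epsilon I_k > \Sigma_{u,o}(\mathcal{G}_{\mathrm{finite}})$. We therefore have $R^{\mathrm{eff}}_{u,o} = \inf S$.

This proves that the infimum $\Sigma_{u,o}$ of (4.1) is well defined, and is equal to the
effective resistance $R^{\mathrm{eff}}_{u,o}$, which concludes the proof. ■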


4.8  Comments  and  open  problems 

In this chapter we provided a formal justification for using infinite graphs as
valid approximations of large but finite graphs, and established conditions under
which this approximation is valid. For example, the condition of bounded degree
of the graph (see Assumption 4.2.1) is seen to be important in the proof of
convergence results.

On the other hand, certain important classes of graphs, such as scale-free
graphs [108] and random geometric graphs [109], have unbounded degree. In a
random geometric graph of $n$ nodes, the average degree of a node has to be $\Omega(\log n)$
to ensure connectivity with high probability [109]. Scale-free graphs exhibit a
heavy-tailed degree distribution so that some nodes have very large degrees with
very small probability. As a result, it is not possible, at least within the confines
of the techniques used here, to study these graphs by examining the behavior
of the limiting infinite graph. It is unclear what it will take to study effective
resistance in infinite graphs with unbounded degree.

The BLUE convergence Theorem 4.5.2 points to an interesting direction for
designing distributed algorithms. The theorem shows that when an infinite number
of measurements are available, the estimate of a node variable based on a
finite subset of the measurements can be arbitrarily close to the estimate that can
be obtained by using all the available measurements. This suggests that using
small subsets of available measurements can give us estimates that are quite close
to the optimal. How to design distributed algorithms to take advantage of this
feature is an open question.


4.9  Technical  proofs 


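We first introduce some terminology. Define a norm for all node-functions
$\omega : \mathcal{V} \to \mathbb{R}^{k \times k}$ as

$$\|\omega\| := \sqrt{\sum_{u \in \mathcal{V}} \|\omega_u\|_F^2} = \sqrt{\sum_{u \in \mathcal{V}} \operatorname{trace}(\omega_u^T \omega_u)}, \qquad (4.11)$$

where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, and a linear vector space
$\mathcal{S}_{\mathcal{V}}$ as the space of all bounded node-functions with respect to the above defined
norm:

$$\mathcal{S}_{\mathcal{V}} = \{\omega : \mathcal{V} \to \mathbb{R}^{k \times k} \mid \|\omega\| < \infty\}. \qquad (4.12)$$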

For an infinite network $(\mathcal{G}, R)$, we introduce the incidence operator $\mathcal{A} : \mathcal{H}_R \to \mathcal{S}_{\mathcal{V}}$,
which is defined by the transformation:

$$(\mathcal{A} j)_u = \sum_{e \in \mathcal{E}} a_{u,e}\, j_e, \qquad j \in \mathcal{H}_R, \qquad (4.13)$$

where $a_{u,e}$ is nonzero if and only if the edge $e$ is incident on the node $u$ and, when
nonzero, $a_{u,e} = -1$ if the edge $e$ is directed towards $u$ and $a_{u,e} = 1$ otherwise. The
incidence operator $\mathcal{A}$ is simply an extension to infinite graphs of the generalized
incidence matrix defined in Section 2.2.1 [see (2.3)] for finite graphs. The series
in (4.13) is absolutely convergent since it involves only a finite number of terms
due to the bounded degree of $\mathcal{G}$.

We call a node-function $\omega \in \mathcal{S}_{\mathcal{V}}$ a divergence for the graph $\mathcal{G}$ if $\omega$ has finite
support and $\sum_{u \in \mathcal{V}} \omega_u = 0$. One can view a divergence as an assignment of flow
sources at a finite number of nodes of the graph so that total flow into the graph
is equal to the total flow out of it.

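An edge-function $j \in \mathcal{H}_R$ is called a flow in $\mathcal{G}$ with divergence $\omega \in \mathcal{S}_{\mathcal{V}}$ if $\omega$ is
a divergence in $\mathcal{G}$ and $j$ satisfies

$$\sum_{(u,v) \in \mathcal{E}} j_{u,v} \; - \sum_{(v,u) \in \mathcal{E}} j_{v,u} = \omega_u, \qquad \forall u \in \mathcal{V}. \qquad (4.14)$$

The condition (4.14) can be compactly represented as

$$\mathcal{A} j = \omega. \qquad (4.15)$$

An edge-function $j \in \mathcal{H}_R$ is called a circulation in $(\mathcal{G}, R)$ if

$$\mathcal{A} j = 0. \qquad (4.16)$$

In other words, a circulation is an element of $\mathcal{H}_R$ that belongs to $\mathcal{N}(\mathcal{A})$, the null
space of $\mathcal{A}$.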

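First we show that the linear operator $\mathcal{A} : \mathcal{H}_R \to \mathcal{S}_{\mathcal{V}}$ defined above is bounded.
Since for each $u \in \mathcal{V}$, $(\mathcal{A} j)_u \in \mathbb{R}^{k \times k}$, we have

$$\|(\mathcal{A} j)_u\|_F = \Big\| \sum_{e \in \mathcal{E}_u} a_{u,e}\, j_e \Big\|_F \le \sum_{e \in \mathcal{E}_u} \|j_e\|_F,$$

where $\mathcal{E}_u$ is the set of edges in $\mathcal{E}$ that are incident on $u$. It can be shown from
the relationship between the Frobenius norm and the singular values of a matrix
that for every edge $e \in \mathcal{E}$, we have $\|j_e\|_F^2 \le \frac{1}{\lambda_{\min}} \operatorname{trace}(j_e^T R_e j_e)$, where $\lambda_{\min}$ is the
uniform lower bound on the smallest eigenvalue of $R_e$, $\forall e \in \mathcal{E}$. Existence of a
positive $\lambda_{\min}$ is guaranteed by Assumption 4.2.1. Since the above is true for every
$u \in \mathcal{V}$, and since the sum over $\mathcal{E}_u$ has at most $d_{\max}$ terms, from (4.11) and the
Cauchy-Schwarz inequality we get

$$\|\mathcal{A} j\|^2 = \sum_{u \in \mathcal{V}} \|(\mathcal{A} j)_u\|_F^2 \le \frac{d_{\max}}{\lambda_{\min}} \sum_{u \in \mathcal{V}} \sum_{e \in \mathcal{E}_u} \operatorname{trace}(j_e^T R_e j_e)$$

$$\le \frac{2 d_{\max}}{\lambda_{\min}} \sum_{e \in \mathcal{E}} \operatorname{trace}(j_e^T R_e j_e) = \frac{2 d_{\max}}{\lambda_{\min}} \|j\|^2,$$

where $d_{\max}$ is the largest degree of the nodes of the graph $\mathcal{G}$, which is finite by
Assumption 4.2.1, and the factor 2 appears because every edge is incident on two
nodes. It follows that

$$\|\mathcal{A}\| \le \sqrt{\frac{2 d_{\max}}{\lambda_{\min}}},$$

which shows that $\mathcal{A}$ is bounded.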
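The boundedness argument can be sanity-checked numerically in the scalar case ($k = 1$). The sketch below is only an illustration under assumptions that are not part of the original text: the small graph, the resistance values, and the helper names `incidence_apply` and `bound_sides` are hypothetical, and the constant $2 d_{\max} / \lambda_{\min}$ is the one obtained by counting each edge at both of its end nodes.

```python
import random

def incidence_apply(n_nodes, edges, j):
    """Apply the incidence operator: (A j)_u = sum_e a_{u,e} j_e (scalar case)."""
    out = [0.0] * n_nodes
    for (u, v), je in zip(edges, j):
        out[u] += je  # edge (u, v) is directed away from u, so a_{u,e} = +1
        out[v] -= je  # and towards v, so a_{v,e} = -1
    return out

def bound_sides(n_nodes, edges, resistances, j):
    """Return (||A j||^2, (2 d_max / lambda_min) ||j||^2) for an edge-function j."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    d_max = max(degree)
    lam_min = min(resistances)  # smallest eigenvalue of a scalar resistance
    lhs = sum(x * x for x in incidence_apply(n_nodes, edges, j))
    norm_j_sq = sum(r * je * je for r, je in zip(resistances, j))  # norm in H_R
    return lhs, 2.0 * d_max / lam_min * norm_j_sq

# Hypothetical example: a 4-cycle with one chord and scalar edge resistances.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
resistances = [1.0, 2.0, 0.5, 1.5, 3.0]
rng = random.Random(0)
results = [
    bound_sides(4, edges, resistances, [rng.uniform(-1, 1) for _ in edges])
    for _ in range(100)
]
print(all(lhs <= rhs for lhs, rhs in results))
```

A random search of this kind cannot prove the bound, but a violation would indicate an error in the constant.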

Now we are ready to prove the Infinite Current Theorem 4.4.1.

Proof of Theorem 4.4.1. We first prove that among the flows in $\mathcal{H}_R$ that are limits
of finite support flows, the flow with the minimum dissipated energy exists and is
unique, and that this flow is a current. Then we show that there can be only one
such current.
For a flow of intensity $\mathbf{i}$ that is injected at $u$ and extracted at $v$, the corresponding
divergence $\omega$ is given by $\omega_u = \mathbf{i}$, $\omega_v = -\mathbf{i}$ and $\omega_p = 0$ for all $p \in \mathcal{V} \setminus \{u, v\}$. Pick
a path $\mathcal{P}$ from $u$ to $v$, and construct a flow $j^{\text{path}}$ of intensity $\mathbf{i}$ from $u$ to $v$ along
$\mathcal{P}$ as follows:

$$j^{\text{path}}_e = \begin{cases} \mathbf{i} & e \in \mathcal{P},\ e \text{ oriented along } \mathcal{P}, \\ -\mathbf{i} & e \in \mathcal{P},\ e \text{ oriented against } \mathcal{P}, \\ 0 & e \notin \mathcal{P}. \end{cases}$$

It is easy to see that $j^{\text{path}}$ is a finite support edge-function in $\mathcal{H}_R$ that satisfies the
constraint equation $\mathcal{A} j = \omega$. All flows satisfying this constraint lie in the linear
variety $j^{\text{path}} + \mathcal{N}(\mathcal{A})$, where $\mathcal{N}(\mathcal{A})$ is the null space of $\mathcal{A}$. Since $\mathcal{A}$ is a bounded
linear operator, its null space is closed. As a result, $\mathcal{N}(\mathcal{A})$, which is the space of all
circulations, is a Hilbert space. Consider the subspace of $\mathcal{N}(\mathcal{A})$ that consists of all
finite support circulations, and denote it by $\mathcal{N}_F(\mathcal{A})$ ("F" for finite support). Its
closure $\overline{\mathcal{N}}_F(\mathcal{A})$ is a closed subspace of the Hilbert space $\mathcal{N}(\mathcal{A})$. By the Projection
Theorem applied to linear varieties (Theorem 1 in section 3.10 of [62]), there exists
a unique edge-function in $j^{\text{path}} + \overline{\mathcal{N}}_F(\mathcal{A})$ of minimum norm, which we call $i$, and
which is orthogonal to $\overline{\mathcal{N}}_F(\mathcal{A})$.

Since $i - j^{\text{path}} \in \overline{\mathcal{N}}_F(\mathcal{A})$, there exists a sequence of finite support circulations
$c^{(n)}$ such that $c^{(n)} \to (i - j^{\text{path}})$, where the convergence is in $\mathcal{H}_R$ norm. Define
$j^{(n)} := j^{\text{path}} + c^{(n)}$, so by construction, each $j^{(n)}$ is a finite support flow of
intensity $\mathbf{i}$ from $u$ to $v$, and $j^{(n)} \to i$ in $\mathcal{H}_R$. This establishes the existence and
uniqueness of the flow with minimum power dissipation that is the limit of a
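sequence of finite support flows.

Since $i$ is orthogonal to $\overline{\mathcal{N}}_F(\mathcal{A})$,

$$\langle i, c \rangle = 0 \qquad (4.17)$$

for every $c \in \overline{\mathcal{N}}_F(\mathcal{A})$. Declare the generalized potential drop across an edge
$e$ as $R_e i_e$ to satisfy Ohm's law. If the graph has no loops, Kirchhoff's loop law is
trivially satisfied by these generalized potential drops. If the graph has loops, pick
an arbitrary loop $\mathcal{C}$ and an arbitrary matrix $J \in \mathbb{R}^{k \times k}$, and consider the finite
support circulation $c^*$ with $c^*_e = f_e J$ for $e \in \mathcal{C}$ and $c^*_e = 0$ otherwise, where
$f_e = 1$ if $e$ is oriented along a fixed direction of traversal of $\mathcal{C}$ and $f_e = -1$
otherwise. From (4.17) and the symmetry of $R_e$,

$$0 = \langle i, c^* \rangle = \sum_{e \in \mathcal{C}} \operatorname{trace}(i_e^T R_e c^*_e)$$

$$= \sum_{e \in \mathcal{C}} f_e \operatorname{trace}(i_e^T R_e J) = \sum_{e \in \mathcal{C}} f_e \operatorname{trace}(J^T R_e i_e)$$

$$= \operatorname{trace}\Big( J^T \sum_{e \in \mathcal{C}} f_e R_e i_e \Big).$$

Since this is true for arbitrary $J$, we must have

$$\sum_{e \in \mathcal{C}} f_e (R_e i_e) = 0, \qquad (4.18)$$

which in turn must be true for every loop $\mathcal{C}$, since the arguments above can be
repeated for every loop. Eq. (4.18) therefore shows that the net potential drop
along every loop is 0. In other words, the generalized potential drops determined
by $i$ in accordance with Ohm's law satisfy Kirchhoff's loop law. Construction of a
generalized node potential function $V$ is now trivial. Therefore $i$ is a generalized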
current.

To prove uniqueness of the current, let $i$ and $\bar{i}$ be two currents from $u$ to $v$ with
intensity $\mathbf{i}$. Define an edge-function $d : \mathcal{E} \to \mathbb{R}^{k \times k}$ as $d_e := i_e - \bar{i}_e$. We see that
$d \in \overline{\mathcal{N}}_F(\mathcal{A})$. From linearity of the inner product,

$$\langle d, d \rangle = \langle i - \bar{i}, i - \bar{i} \rangle = \langle i, d \rangle - \langle \bar{i}, d \rangle = 0 - 0,$$

where the last equality follows from (4.17), since by construction, both $i$ and $\bar{i}$
are currents. It follows that

$$\sum_{e \in \mathcal{E}} \operatorname{trace}(d_e^T R_e d_e) = 0 \;\Rightarrow\; d_e = 0 \quad \forall e \in \mathcal{E},$$

since $R_e > 0$ for all edges $e \in \mathcal{E}$. We therefore conclude that $i = \bar{i}$, which proves
that the current $i$ is unique.

To examine the uniqueness of potentials, suppose that $V$ and $\bar{V}$ are two potentials
associated with the same current. Because of Ohm's law, we conclude that

$$V_u - V_v = \bar{V}_u - \bar{V}_v \;\Rightarrow\; D_u = D_v, \quad \forall (u, v) \in \mathcal{E},$$

where $D = V - \bar{V}$. Since $\mathcal{G}$ is connected, $D$ must be a constant, but is otherwise
arbitrary. This shows that the node potentials are unique up to an additive
constant.

If $i$ is a current with intensity $\mathbf{i}$ and $\bar{i}$ is a current with intensity $\bar{\mathbf{i}}$, both from $u$
to $v$, it can be shown in a straightforward manner that $\alpha i + \beta \bar{i}$ is also a current
with intensity $\alpha \mathbf{i} + \beta \bar{\mathbf{i}}$ from $u$ to $v$, from which the linearity from $\mathbf{i}$ to $i$ follows. A
similar linearity proof also holds for the potential differences. ■

The corollary presented next is essentially a repetition of (4.17), but is restated
because of its usefulness in several subsequent proofs.

Corollary 4.9.1. A flow $i$ is the generalized current in the network $(\mathcal{G}, R)$ if and
only if

$$\langle i, c \rangle = 0$$

for every circulation $c \in \overline{\mathcal{N}}_F(\mathcal{A})$. □

Next we prove that the linear mapping between intensity and voltage drop
between the source node and sink node is given by a $k \times k$ matrix.

Proof of Lemma 4.4.1. For the current with intensity $\mathbf{i}$ flowing from $u$ to $v$, we
define a divergence $\omega$ as

$$\omega_p = 0 \quad \forall p \in \mathcal{V} \setminus \{u, v\}, \qquad \omega_u = \mathbf{i}, \qquad \omega_v = -\mathbf{i}.$$

The flow constraint now becomes $\mathcal{A} j = \omega$. The current $i$ is the flow that satisfies
this constraint and minimizes the energy dissipation $\sum_{e \in \mathcal{E}} \operatorname{trace}(j_e^T R_e j_e)$, as shown
in Theorem 4.4.1. For every node $p \in \mathcal{V}$, the flow constraint becomes

$$(\mathcal{A} j)_p = \omega_p \;\Rightarrow\; \sum_{e \in \mathcal{E}_p} a_{p,e}\, j_e = \omega_p. \qquad (4.19)$$

Recognizing that this is a $k \times k$ matrix equation, we express it as $k$ separate vector
equations:

$$\sum_{e \in \mathcal{E}_p} a_{p,e}\, j_{e,l} = \omega_{p,l}, \qquad l = 1, \dots, k,$$

where the second subscript $l$ represents the $l$th column of the corresponding matrix.
It is easy to see that, for every $l$, the constraints on the $l$th column of the $j_e$'s depend
only on the $l$th column of $\omega_p$, and therefore on the $l$th column of $\mathbf{i}$. As a result, the
solution to this optimization problem is equivalent to solving $k$ separate problems
"minimize $\sum_{e \in \mathcal{E}} j_{e,l}^T R_e j_{e,l}$ subject to $\mathcal{A} j_l = \omega_l$", for $l = 1, \dots, k$, where the edge
function $j_l$ and the node function $\omega_l$ are now vector-valued: $j_l : \mathcal{E} \to \mathbb{R}^k$, $\omega_l : \mathcal{V} \to \mathbb{R}^k$;
the spaces $\mathcal{H}_R$ and $\mathcal{S}_{\mathcal{V}}$ are appropriately redefined, and the incidence operator
$\mathcal{A}$ has the same definition as in (4.13) with respect to the new spaces $\mathcal{H}_R$, $\mathcal{S}_{\mathcal{V}}$.
Because of column-wise independence of the current on the intensities, the matrix
current on every edge is obtained by stacking the $k$ vector-valued currents on
that edge as columns. For every vector-valued current intensity $\mathbf{i}_l$
we obtain a corresponding vector-valued potential difference $V_{u,l} - V_{v,l}$. Again,
the matrix-valued potential difference $V_u - V_v$ resulting from the original problem
consists of the $k$ columns that are the vector-valued potential differences $V_{u,l} - V_{v,l}$
resulting from the $k$ separate optimization problems described above.

These $k$ separate optimization problems can be solved to determine the vector-valued
edge currents in the same manner that the single optimization problem was
solved in the proof of Theorem 4.4.1 to determine the matrix-valued edge currents.
In fact, only one of these $k$ problems needs to be solved. To understand why, we
first note that the linearity between the matrix-valued quantities $\mathbf{i}$ and $V_u - V_v$
that was established in Theorem 4.4.1 will be retained between the corresponding
vector-valued quantities. Specifically, when a vector-valued current $i_l$ flows from
$u$ to $v$ with vector intensity $\mathbf{i}_l$, the vector-valued voltage drop $V_{u,l} - V_{v,l}$ will be a
linear function of the vector intensity $\mathbf{i}_l$, which will be in general a $k \times k$ matrix.
Let $R^{\text{eff}}_{u,v} \in \mathbb{R}^{k \times k}$ be this matrix. Then,

$$V_{u,l} - V_{v,l} = R^{\text{eff}}_{u,v}\, \mathbf{i}_l, \qquad \forall\, \mathbf{i}_l \in \mathbb{R}^k. \qquad (4.20)$$

From linearity, the same is true for every $l = 1, \dots, k$. Stacking together the $k$
columns in (4.20), for $l = 1, \dots, k$, we get $V_u - V_v = R^{\text{eff}}_{u,v}\, \mathbf{i}$, which proves that
the linear mapping between matrix intensity $\mathbf{i}$ and matrix-valued potential drop
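$V_u - V_v$ is the $k \times k$ matrix $R^{\text{eff}}_{u,v}$. ■

Proof of Lemma 4.4.2. Pick a path $\mathcal{P}$ from $u$ to $v$, and construct a flow $j^{\text{path}}$ of
intensity $\mathbf{j}$ from $u$ to $v$ along $\mathcal{P}$ as follows:

$$j^{\text{path}}_e = \begin{cases} \mathbf{j} & e \in \mathcal{P},\ e \text{ oriented along } \mathcal{P}, \\ -\mathbf{j} & e \in \mathcal{P},\ e \text{ oriented against } \mathcal{P}, \\ 0 & e \notin \mathcal{P}. \end{cases}$$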
The assumed properties of $j$ imply that $j \in j^{\text{path}} + \overline{\mathcal{N}}_F(\mathcal{A})$. Let $j^{(n)}$ be a sequence
of finite support flows in $(\mathcal{G}, R)$ that converge to the flow $j$, i.e., $j^{(n)} \to j$ in $\mathcal{H}_R$.
Define

$$c := j - j^{\text{path}}, \qquad c^{(n)} := j^{(n)} - j^{\text{path}}.$$

The function $c \in \mathcal{H}_R$ is a circulation since it is the difference between two flows
of the same intensity between the same two nodes. Moreover, $\{c^{(n)}\}$ is a sequence
of finite-support circulations that converge to $c$ in $\mathcal{H}_R$. Now, since $c^{(n)}$ is a finite
support circulation, from Corollary 4.9.1, $\langle i, c^{(n)} \rangle = \langle i, j^{(n)} - j^{\text{path}} \rangle = 0$ for every
$n$, and therefore

$$\lim_{n \to \infty} \langle i, c^{(n)} \rangle = 0.$$

Using linearity and continuity of the inner product, we therefore conclude that

$$\lim_{n \to \infty} \langle i, j^{(n)} \rangle = \langle i, j^{\text{path}} \rangle \;\Rightarrow\; \langle i, j \rangle = \langle i, j^{\text{path}} \rangle$$

$$\Rightarrow\; \sum_{e \in \mathcal{E}} \operatorname{trace}(i_e^T R_e j_e) = \sum_{e \in \mathcal{P}} \operatorname{trace}\big((R_e i_e)^T j^{\text{path}}_e\big) = \operatorname{trace}\big((V_u - V_v)^T \mathbf{j}\big). \qquad (4.21)$$

Since $i, j \in \mathcal{H}_R$, denoting the $s$th column of $i_e$ by $i_{s,e}$ and the $t$th column of $j_e$ by
$j_{t,e}$, we can show from (4.21) using straightforward algebraic manipulation that

$$q_{s,t} := \sum_{e=1}^{\infty} i_{s,e}^T R_e\, j_{t,e} < \infty, \qquad \forall s, t \in \{1, \dots, k\},$$

and that the series converges absolutely for every $s$ and $t$. Define the matrix $Q$
by $[Q]_{s,t} = q_{s,t}$. Since the series converges, for every $\epsilon > 0$, we can choose $N$ large
enough such that

$$\Big\| \sum_{e=1}^{N} i_e^T R_e j_e - Q \Big\| < \epsilon,$$

where $\|\cdot\|$ represents any matrix norm. We thus conclude that since $i, j \in \mathcal{H}_R$,
the series $\sum_{e \in \mathcal{E}} i_e^T R_e j_e$ converges absolutely to a $k \times k$ matrix. Since (4.21) holds
for an arbitrary $\mathbf{j}$, it can be shown in a straightforward manner that the series
$\sum_{e \in \mathcal{E}} i_e^T R_e j_e$ must converge to $(V_u - V_v)^T \mathbf{j}$. Therefore we get the desired result

$$\sum_{e \in \mathcal{E}} i_e^T R_e j_e = (V_u - V_v)^T \mathbf{j}. \qquad ■$$

We now prove the Finite Approximation Theorem 4.6.2.

Proof of Theorem 4.6.2. For every $\epsilon > 0$, we can find a finite-support flow $j^{(n)}$
from $u$ to $v$ of intensity $\mathbf{i}$ such that

$$\|i - j^{(n)}\| < \epsilon, \qquad (4.22)$$

which follows from the characterization of the current $i$ in Theorem 4.4.1. Pick
a finite subgraph $\mathcal{G}^{(n)} = (\mathcal{V}^{(n)}, \mathcal{E}^{(n)})$ of $\mathcal{G}$ from the nested sequence $\{\mathcal{G}^{(n)}\}$ such
that the support of $j^{(n)}$ lies in $\mathcal{E}^{(n)}$ (i.e., the edges on which $j^{(n)}$ is not zero are
in $\mathcal{E}^{(n)}$). Note that by construction $u, v \in \mathcal{V}^{(n)}$. Denoting by $i^{(n)}$ the current in
$(\mathcal{G}^{(n)}, R)$, it follows from Corollary 4.9.1 that for a circulation $c^{(n)}$ whose support
lies in $\mathcal{G}^{(n)}$,

$$\langle i^{(n)}, c^{(n)} \rangle = 0, \quad \text{and} \quad \langle i, c^{(n)} \rangle = 0$$

$$\Rightarrow\; |\langle i - j^{(n)}, c^{(n)} \rangle| = |\langle i^{(n)} - j^{(n)}, c^{(n)} \rangle|.$$

Pick $c^{(n)} = i^{(n)} - j^{(n)}$, which, being a difference of two finite support flows from
$u$ to $v$ with the same intensity, is a finite support circulation. Furthermore, its
support lies in $\mathcal{G}^{(n)}$ since both $i^{(n)}$ and $j^{(n)}$ have their support in $\mathcal{G}^{(n)}$. For this
choice of $c^{(n)}$ in the equation above, we get

$$|\langle i - j^{(n)}, i^{(n)} - j^{(n)} \rangle| = \|i^{(n)} - j^{(n)}\|^2$$

$$\Rightarrow\; \|i^{(n)} - j^{(n)}\|^2 \le \|i - j^{(n)}\| \, \|i^{(n)} - j^{(n)}\|$$

from the Cauchy-Schwarz inequality. Therefore,

$$\|i^{(n)} - j^{(n)}\| \le \|i - j^{(n)}\| < \epsilon,$$

from (4.22). From the triangle inequality, we now get

$$\|i - i^{(n)}\| \le \|i - j^{(n)}\| + \|i^{(n)} - j^{(n)}\| < 2\epsilon,$$

which proves the statement that $i^{(n)} \to i$ in $\mathcal{H}_R$.

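To prove the convergence of the effective resistances, pick an arbitrary $\mathbf{i} \in \mathbb{R}^{k \times k}$
and let $i$ and $i^{(n)}$ be the currents with intensity $\mathbf{i}$ from $u$ to $v$ in $(\mathcal{G}, R)$ and $(\mathcal{G}^{(n)}, R)$,
respectively. It follows from Lemma 4.4.2 that

$$\sum_{e \in \mathcal{E}} i_e^T R_e i_e = \mathbf{i}^T R^{\text{eff}}_{u,v}\, \mathbf{i}, \qquad \sum_{e \in \mathcal{E}^{(n)}} i_e^{(n)T} R_e i_e^{(n)} = \mathbf{i}^T R^{\text{eff},(n)}_{u,v}\, \mathbf{i}, \qquad \sum_{e \in \mathcal{E}} i_e^T R_e i_e^{(n)} = \mathbf{i}^T R^{\text{eff}}_{u,v}\, \mathbf{i},$$

where the last equality uses the fact that $i^{(n)}$ is a flow in $\mathcal{G}$ with intensity $\mathbf{i}$ (though
not a current). Therefore,

$$\sum_{e \in \mathcal{E}} \operatorname{trace}\big( (i_e - i_e^{(n)})^T R_e (i_e - i_e^{(n)}) \big) = \operatorname{trace}\big( \mathbf{i}^T (R^{\text{eff},(n)}_{u,v} - R^{\text{eff}}_{u,v})\, \mathbf{i} \big).$$

Since $i^{(n)} \to i$ in $\mathcal{H}_R$, the left hand side goes to 0 as $n \to \infty$. Since this is true for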
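arbitrary $\mathbf{i}$, $R^{\text{eff},(n)}_{u,v} \to R^{\text{eff}}_{u,v}$. ■

Proof of Lemma 4.6.1. Let $i^s : \mathcal{E} \to \mathbb{R}$ be the scalar current in $(\mathcal{G}, 1)$ of unit
scalar intensity from $u$ to $v$. It follows from (4.6) that

$$r^{\text{eff}}_{u,v} = \sum_{e \in \mathcal{E}} (i^s_e)^2,$$

where $r^{\text{eff}}_{u,v}$ denotes the scalar effective resistance between $u$ and $v$ in $(\mathcal{G}, 1)$.
We first claim that the matrix current $i$ in $(\mathcal{G}, R_o)$ of intensity $I$ (the $k \times k$ identity
matrix) from $u$ to $v$ is given by $i^s I$.

To prove this claim, let $c \in \overline{\mathcal{N}}_F(\mathcal{A})$. Since $i \in \mathcal{H}_R$ trivially,

$$\langle i, c \rangle = \sum_{e \in \mathcal{E}} \operatorname{trace}(i_e^T R_o c_e) = \sum_{e \in \mathcal{E}} i^s_e \operatorname{trace}(R_o c_e)$$

$$= \sum_{e \in \mathcal{E}} i^s_e \operatorname{trace}(\tilde{c}_e) = \sum_{e \in \mathcal{E}} i^s_e \sum_{l=1}^{k} \tilde{c}_e^{(l,l)},$$

where $\tilde{c}_e := R_o c_e \in \mathbb{R}^{k \times k}$ for every $e \in \mathcal{E}$, and $\tilde{c}_e^{(l,m)} \in \mathbb{R}$ represents the $(l, m)$th
scalar entry of the matrix $\tilde{c}_e$. Hence,

$$\langle i, c \rangle = \sum_{e \in \mathcal{E}} \sum_{l=1}^{k} i^s_e\, \tilde{c}_e^{(l,l)} = \sum_{l=1}^{k} \sum_{e \in \mathcal{E}} i^s_e\, \tilde{c}_e^{(l,l)} = \sum_{l=1}^{k} \langle i^s, \tilde{c}^{(l,l)} \rangle, \qquad (4.23)$$

where the $k$ inner products on the right hand side are evaluated in the space $\mathcal{H}_1$
defined for the scalar network $(\mathcal{G}, 1)$. Since $c$ is a circulation, $\mathcal{A} c = 0$. Therefore

$$\sum_{e \in \mathcal{E}_p} a_{p,e}\, \tilde{c}_e = R_o \sum_{e \in \mathcal{E}_p} a_{p,e}\, c_e = 0, \qquad \forall p \in \mathcal{V},$$

where $\mathcal{E}_p$ is the set of edges in $\mathcal{G}$ that are incident on $p$, which shows that $\tilde{c}$ is
also a circulation. Clearly, each scalar-valued edge function $\tilde{c}^{(l,l)} : \mathcal{E} \to \mathbb{R}$ is also
a circulation for the scalar electrical network $(\mathcal{G}, 1)$. It follows that $\langle i^s, \tilde{c}^{(l,l)} \rangle = 0$
for each $l = 1, \dots, k$. Hence, (4.23) implies

$$\langle i, c \rangle = 0$$

for every $c \in \overline{\mathcal{N}}_F(\mathcal{A})$, which is precisely the characterization of the current in $\mathcal{H}_R$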
stated in Corollary 4.9.1. This proves our claim that $i^s I$ is the current in $(\mathcal{G}, R_o)$.

Therefore, the effective resistance is given by

$$R^{\text{eff}}_{u,v} = \sum_{e \in \mathcal{E}} i_e^T R_o\, i_e = R_o \sum_{e \in \mathcal{E}} (i^s_e)^2 = r^{\text{eff}}_{u,v}\, R_o,$$

because of (4.6), which completes the proof. ■

Proof of Proposition 4.6.1. Pick an arbitrary $j$. Without loss of generality, assume
that all the edges in $\mathcal{E}_j$ are directed in the same way (cf. Remark 4.6.1).
Construct a network $(\mathcal{G}', R')$, where $\mathcal{G}' = (\mathcal{V}, \mathcal{E}')$, such that every set $\mathcal{E}_j \subset \mathcal{E}$ is
replaced by a single edge $e_j \in \mathcal{E}'$, and the edge resistances are assigned as

$$R'^{-1}_{e_j} := \sum_{e \in \mathcal{E}_j} R_e^{-1}. \qquad (4.24)$$

The orientation of $e_j$ is taken as that of the edges in $\mathcal{E}_j$. That is, if the edges in
$\mathcal{E}_j$ are incident on $u$ and $v$ and directed from $u$ to $v$, then $e_j := (u, v)$.
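The parallel-combination rule (4.24) can be checked numerically for $2 \times 2$ matrix resistances. The following is a minimal sketch under stated assumptions: the resistance and potential-drop values are hypothetical, the helper names (`mat_add`, `mat_mul`, `mat_inv`, `parallel_resistance`) are illustrative only, and the matrices are restricted to $2 \times 2$ so that no external library is needed. It verifies that when two parallel edges carry currents producing the same potential drop, the merged edge with resistance $R'$ carrying the summed current reproduces that drop.

```python
def mat_add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

def mat_inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def parallel_resistance(resistances):
    """Matrix resistance R' with R'^{-1} equal to the sum of the R_e^{-1}."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for res in resistances:
        total = mat_add(total, mat_inv(res))
    return mat_inv(total)

# Hypothetical data: two parallel edges with positive definite resistances
# and a common (matrix-valued) potential drop between their end nodes.
r1 = [[2.0, 0.5], [0.5, 1.0]]
r2 = [[1.0, 0.0], [0.0, 3.0]]
drop = [[1.0, 2.0], [3.0, 4.0]]

# Ohm's law on each edge: drop = R_e i_e, so i_e = R_e^{-1} drop.
i1 = mat_mul(mat_inv(r1), drop)
i2 = mat_mul(mat_inv(r2), drop)

# The merged edge carries i1 + i2 and should reproduce the same drop.
r_merged = parallel_resistance([r1, r2])
merged_drop = mat_mul(r_merged, mat_add(i1, i2))
max_error = max(abs(merged_drop[r][c] - drop[r][c])
                for r in range(2) for c in range(2))
print(max_error < 1e-9)
```

In the scalar case the same rule reduces to the familiar formula for resistors in parallel.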

Let $i : \mathcal{E} \to \mathbb{R}^{k \times k}$ be the current in the network $(\mathcal{G}, R)$ with intensity $I$ from $p$
to $q$, where $p, q$ are two arbitrary nodes in $\mathcal{G}$. Assign a flow $i' : \mathcal{E}' \to \mathbb{R}^{k \times k}$ in the
graph $\mathcal{G}'$ as

$$i'_{e_j} := \sum_{e \in \mathcal{E}_j} i_e, \qquad \forall j. \qquad (4.25)$$

We will show first that $i'$ is the current in $(\mathcal{G}', R')$ with intensity $I$ from $p$ to $q$.
It is easy to see that $i'$ satisfies Kirchhoff's current law (4.3). To check if Kirchhoff's
voltage law and Ohm's law are satisfied, pick an arbitrary $j$, and let the edges in $\mathcal{E}_j$ be
directed from $u$ to $v$ for some $u, v \in \mathcal{V}$. Let $\ell$ be the number of edges in the set
$\mathcal{E}_j$, and denote the edges in $\mathcal{E}_j$ as $e_1, e_2, \dots, e_\ell$. Since the potential drop between
$u$ and $v$ in the network $(\mathcal{G}, R)$ is

$$V_u - V_v = R_{e_1} i_{e_1} = R_{e_2} i_{e_2} = \dots = R_{e_\ell} i_{e_\ell}, \qquad (4.26)$$

we get

$$R'_{e_j} i'_{e_j} = \Big( \sum_{e \in \mathcal{E}_j} R_e^{-1} \Big)^{-1} (i_{e_1} + i_{e_2} + \dots + i_{e_\ell})$$

$$= \Big( \sum_{e \in \mathcal{E}_j} R_e^{-1} \Big)^{-1} \big( i_{e_1} + R_{e_2}^{-1} R_{e_1} i_{e_1} + \dots + R_{e_\ell}^{-1} R_{e_1} i_{e_1} \big)$$

$$= R_{e_1} i_{e_1} = V_u - V_v.$$

This shows that $i'$ is a current in the network $(\mathcal{G}', R')$ with node potential function
$V$, the same as in $(\mathcal{G}, R)$. Therefore, the potential drop between $p$ and $q$, which
is the effective resistance between them, is the same in the two networks. The
same argument applies to all node pairs, which proves the result. ■

Chapter 5


Error scaling laws

5.1  Introduction 

In this chapter we answer the error-scaling question on estimation with relative
measurements that was raised at the beginning of Chapter 1. We want to examine
how the minimum possible estimation error of a node variable $x_u$ varies with the
node's distance from the reference node $o$ in a large measurement graph, and how
this variation is affected by the structure of the measurement graph.

As discussed in Chapter 1 and again in Section 4.1, the error scaling question is
important for large measurement graphs. Therefore we consider the limiting case
of infinite graphs, in which the number of variables and available measurements
are countably infinite. It was shown in the preceding chapter that the optimal
estimation error of a node variable in a large but finite subgraph of the infinite
graph can be made arbitrarily close to the error in the infinite graph, by making the
finite graph sufficiently large. Intuitively, for a fixed node, as the graph becomes
larger and larger, it appears to extend to infinity in all directions from the point of
view of the node. Therefore, as long as we are interested in the error covariance of a
node that is sufficiently inside a large graph (i.e., not too close to the boundary),
conclusions drawn for infinite graphs are applicable to large but finite graphs.
The advantage is that analyzing infinite graphs is often easier, since boundary
conditions are not as important in an infinite graph as they are in a finite graph. In
addition, the minimum estimation error achievable in an arbitrarily large graph
can be characterized by the limiting BLUE covariance in an infinite graph that
was defined in the previous chapter. For this reason, in this chapter we examine
the scaling laws for the BLUE estimation error covariance in infinite measurement
graphs.

The structure of a measurement graph has a direct bearing on how the estimation
error of a node variable varies with its distance from the reference. When
the measurement graph is a tree, there is a single path between the $u$th node and
the reference node and one can show that the covariance matrix of the estimation
error is the sum of the covariance matrices associated with this path. Thus, for
trees, the variance of the optimal estimation error of $x_u$ grows linearly with the
distance of node $u$ from the reference node. It turns out that for graphs "denser"
than trees, with multiple paths between pairs of nodes, the variance of the optimal
estimation error can grow less than linearly with distance.
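The tree case described above can be made concrete with a few lines of code. This is a minimal sketch, assuming scalar node variables and a hypothetical tree encoded as a `parent` map with one measurement-error variance per edge; the function name `tree_error_variance` and all numerical values are illustrative and not taken from the text.

```python
def tree_error_variance(node, parent, edge_variance, reference=0):
    """Sum the measurement-error variances along the unique path to the reference."""
    total = 0.0
    while node != reference:
        total += edge_variance[(node, parent[node])]
        node = parent[node]
    return total

# Hypothetical tree rooted at the reference node 0:
#   0 - 1 - 2 - 3   with a side branch   1 - 4
parent = {1: 0, 2: 1, 3: 2, 4: 1}
edge_variance = {(1, 0): 1.0, (2, 1): 1.0, (3, 2): 1.0, (4, 1): 0.5}

variances = {u: tree_error_variance(u, parent, edge_variance) for u in parent}
for u in sorted(variances):
    print(u, variances[u])
```

With equal edge variances along the chain 0 - 1 - 2 - 3, the error variance equals the hop distance from the reference, which is the linear growth mentioned above.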

However, the notion of denseness of a graph is not easy to define. In classical
graph-theoretic terminology, a graph with $n$ vertices is called dense if its average
node degree is of order $n$, and is called sparse if its average node degree is a
constant independent of $n$ [110]. Recall that the degree of a node refers to the
number of edges that are incident on it. An edge $(u, v)$ is said to be incident on
the nodes $u$ and $v$. In the sensor network literature that examines the accuracy of
location estimation from range measurement, graph density is recognized to affect
estimation accuracy, although graph density is measured by the average number
of nodes per unit area of a deployed network [15, 16]. However, none of these
measures determines how the estimation error scales with the size of the graph,
as we will see through examples in Section 5.3.3.

A notion of graph denseness and sparseness that is useful in predicting error
scaling laws can be developed by examining the relationship between the graph
and a lattice. The $d$-dimensional square lattice $\mathbb{Z}^d$ is defined as a graph with a
node in every point in $\mathbb{R}^d$ with integer coordinates and an edge between every
pair of nodes that have a Euclidean distance 1 between them (see Figure 5.3
for examples). The error scaling laws for a lattice measurement graph can be
determined analytically by exploiting the symmetry of the lattice.
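As an illustration of the lattice definition, the neighbors of a node in $\mathbb{Z}^d$ can be enumerated directly: two integer points are at Euclidean distance 1 exactly when they differ by $\pm 1$ in a single coordinate. The function name `lattice_neighbors` below is a hypothetical helper introduced only for this sketch.

```python
def lattice_neighbors(point):
    """Neighbors of an integer point in Z^d: all points at Euclidean distance 1."""
    neighbors = []
    for axis in range(len(point)):
        for step in (-1, 1):
            q = list(point)
            q[axis] += step  # move one unit along a single coordinate axis
            neighbors.append(tuple(q))
    return neighbors

for origin in [(0,), (0, 0), (0, 0, 0)]:
    print(len(origin), len(lattice_neighbors(origin)))
```

Every node of $\mathbb{Z}^d$ therefore has degree $2d$, so the lattice has bounded degree in every dimension.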

When the graph is not a lattice, it can still be compared to a lattice. Intuitively,
if a graph, after some bounded perturbation in its node and edge set, looks
approximately like a $d$-dimensional lattice, the graph is as dense as a lattice. In
that case the error covariance in the lattice can be used to bound the error covariance
in the graph. It also turns out that the graphs that can be compared to
lattices in this manner are realistic models of sensor networks obtained by placing
nodes in a geographical area in an ad-hoc fashion. Thus, the error scaling laws
obtained for these graphs turn out to be quite useful in design and deployment of
realistic sensor networks.

Chapter organization: After outlining the contributions of this chapter in Section
5.2, we pose the error scaling problem precisely in Section 5.3. This section
also contains the main results of this chapter in the form of theorems and lemmas.
Although the characterization of graphs according to how fast the errors
grow with distance is provided in the beginning of Section 5.3, in Section 5.4 we
return to these graphs for a more extensive discussion of their properties. Section
5.5 starts with scaling laws for lattices and a close relative of theirs, hints
at how these can be extended to a much broader class by thinking of graphs as
coarse approximations of Euclidean spaces, and ends by providing the formal proof
of the scaling laws that were stated earlier in Section 5.3. The chapter ends with
a discussion of open problems in Section 5.6.


5.2  Contributions  and  prior  work. 

The  results  established  in  this  chapter  and  their  implications  are  summarized 
below: 

1. We derive a classification of graphs, dense and sparse graphs in $\mathbb{R}^d$, $d =
1, 2, 3$, that determines the rate at which the limiting BLUE covariance of a
node variable changes with the node's distance from the reference. For dense
graphs, upper bounds on the growth rate of the error, and for sparse graphs,
lower bounds on the estimation error growth, are obtained. In particular,
when a graph is dense in 1D, 2D, and 3D, respectively, the error covariance
of a node is upper bounded by a linear, logarithmic, and bounded function of
its distance from the reference. On the other hand, when a graph is sparse
in 1D, 2D, and 3D respectively, the error covariance of a node is lower
bounded by a linear, logarithmic, and bounded function of its distance from
the reference.

2. The error scaling laws derived in this chapter put an algorithm-independent
limit to the estimation accuracy achievable in large networks, since no estimation
algorithm can achieve higher accuracy than the optimal estimator.
For this reason, the bounds and the associated graph classification can be
useful in performance analysis, design, and deployment of large networks.
For example, when a graph is sparse in 1D, the optimal estimation error covariance
grows at least linearly with the distance from the reference. Therefore
the estimation accuracy will be necessarily poor in 1D sparse graphs.
Recognizing the sparseness of the graph will help the user to realize that
either high estimation accuracy cannot be achieved, or more reference nodes
need to be introduced. On the other hand, when a graph is dense in 3D, the
optimal estimation error of every node variable remains below a constant,
even for nodes that are arbitrarily far away from the reference node. So accurate
estimation is possible in 3D dense graphs. One can therefore try to
deploy networks that satisfy denseness properties so that guarantees on the
estimation error can be provided a priori.

3. We show that graphs obtained by placing nodes in a geographical area in
an ad-hoc fashion are likely to fall into one of the classes of graphs identified
here. Since we now know which structural properties are beneficial for
accurate estimation, and such structures are achievable by realistic sensor
networks, we can strive to achieve those structures in deploying a network.

4.  The  results  described  in  this  chapter  expose  certain  misconceptions  that 
exist  in  the  sensor  network  literature  about  the  relationship  between  graph 
structure  and  estimation  error.  In  Section  5.3.3,  we  provide  examples  that 
show  the  inadequacy  of  the  usual  measures  of  graph  denseness,  such  as  node 
degree,  in  determining  scaling  laws  of  the  estimation  error. 
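The contrast between linear growth in 1D and much slower growth in 2D (item 1 above) can be illustrated on a finite grid by using the electrical analogy, in which the error variance of a node equals the effective resistance between that node and the reference. The sketch below is only an illustration under stated assumptions: scalar measurements with unit error variance on every edge, a hypothetical 21 x 21 grid with the reference at its center, and a plain conjugate-gradient solver written for this example; the function names are not from the text, and a finite grid only approximates the infinite lattice.

```python
def grid_neighbors(p, size):
    """Nodes adjacent to p in a size x size square grid."""
    row, col = p
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        q = (row + d_row, col + d_col)
        if 0 <= q[0] < size and 0 <= q[1] < size:
            yield q

def effective_resistance(size, ref, node, tol=1e-20, max_iter=5000):
    """Effective resistance between ref and node in a grid of unit resistors.

    Injects a unit current at `node`, grounds `ref`, and solves the grounded
    Laplacian system by conjugate gradients; the potential at `node` is the
    effective resistance.
    """
    nodes = [(a, b) for a in range(size) for b in range(size) if (a, b) != ref]
    index = {p: k for k, p in enumerate(nodes)}
    nbrs = [list(grid_neighbors(p, size)) for p in nodes]

    def apply_laplacian(x):
        y = []
        for k in range(len(nodes)):
            acc = len(nbrs[k]) * x[k]
            for q in nbrs[k]:
                if q != ref:  # the reference potential is fixed at zero
                    acc -= x[index[q]]
            y.append(acc)
        return y

    b = [0.0] * len(nodes)
    b[index[node]] = 1.0
    x = [0.0] * len(nodes)
    res = b[:]
    direction = res[:]
    rs = sum(v * v for v in res)
    for _ in range(max_iter):
        if rs < tol:
            break
        ad = apply_laplacian(direction)
        alpha = rs / sum(d * a for d, a in zip(direction, ad))
        x = [xi + alpha * d for xi, d in zip(x, direction)]
        res = [ri - alpha * a for ri, a in zip(res, ad)]
        rs_new = sum(v * v for v in res)
        direction = [ri + (rs_new / rs) * d for ri, d in zip(res, direction)]
        rs = rs_new
    return x[index[node]]

size, ref = 21, (10, 10)
for dist in (1, 2, 4, 8):
    r_2d = effective_resistance(size, ref, (10, 10 + dist))
    # On a path graph (1D) with unit resistors the resistance equals the distance.
    print(dist, float(dist), round(r_2d, 3))
```

On the path graph the resistance at hop distance 8 is exactly 8, while on the grid three edge-disjoint paths of lengths 8, 10 and 10 already bound it by about 3.1, which shows how multiple paths slow the error growth.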

The material presented in this chapter was published in a preliminary form in [111].

Prior work: Although the problem of estimation from relative measurements
appears in several sensor and ad-hoc network applications, there has been no
systematic study on the effect of network structure on the achievable estimation
error. Among all the possible applications discussed in Section 2.1, localization
and time-synchronization in sensor networks have attracted the most interest from
the research community.

However, the majority of the existing literature on localization is concerned
with developing algorithms for estimating locations from relative range measurements
alone, as discussed in Section 2.1.1. Notwithstanding this major difference
in the problem formulation, the few papers that have attempted to examine the
effect of various parameters on the accuracy of localization from range measurements,
such as [15, 16, 37, 112], fail to provide a clear answer to the question of
how network structure affects estimation accuracy. Most of the papers concluded
that high node degree is beneficial to estimation accuracy [15, 37].

There  is  a  substantial  body  of  literature  on  time-synchronization  from  relative 
clock  offset  measurements,  but  the  typical  estimation  algorithms  do  not  attempt  to 
compute  the  optimal  estimates.  To  the  best  of  our  knowledge,  optimal  clock  offset 
estimation  from  relative  measurements  was  examined  for  the  first  time  by  Karp 
et  al.  [73],  and  thereafter  by  Barooah  et  al.  [65],  Barooah  and  Hespanha  [67]  and 
then  by  [113].  However,  the  focus  of  these  papers  was  distributed  computation  of 
the estimates and not the examination of network structure’s effect on achievable
estimation accuracy.

5.3 Problem statement and main results


Recall that in this chapter we consider the problem of estimating a countably
infinite number of vector-valued variables $x_u \in \mathbb{R}^k$,
$u \in \mathcal{V} := \{1, 2, \dots\}$, from noisy relative measurements of the form:

$\zeta_{u,v} = x_u - x_v + \epsilon_{u,v}, \quad (u,v) \in \mathcal{E},$

where $\epsilon_{u,v}$ denotes a zero-mean measurement noise and $\mathcal{E}$ is the set of
ordered pairs $(u,v)$ for which relative measurements are available. We assume that the
value of a particular reference variable $x_o$ is known, and without loss of generality
we take $x_o = 0$. The node set $\mathcal{V}$ and the edge set $\mathcal{E}$ together define a
directed measurement graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$.

The accuracy of a node variable’s estimate, measured in terms of the covariance
of the estimation error, depends on the graph $\mathcal{G}$ as well as the measurement
errors. The covariance matrix of the error $\epsilon_{u,v}$ in the measurement $\zeta_{u,v}$
is denoted by $P_{u,v}$, i.e., $P_{u,v} := \mathrm{E}[\epsilon_{u,v} \epsilon_{u,v}^T]$. The
measurement errors on different edges are uncorrelated, i.e., for every pair of distinct
edges $e, \bar{e} \in \mathcal{E}$, $\mathrm{E}[\epsilon_e \epsilon_{\bar{e}}^T] = 0$. The
estimation problem is now formulated in terms of a network $(\mathcal{G}, P)$ where
$P : \mathcal{E} \to \mathbb{S}^{k+}$ is a function that assigns to each edge
$(u,v) \in \mathcal{E}$ the covariance matrix $P_{u,v}$ of the measurement error associated
with the edge $(u,v)$ in the measurement graph $\mathcal{G}$.

The problem is to determine how the BLUE covariance $\Sigma_{u,o}$ scales as a function
of the distance of node $u$ from the reference $o$, and how this scaling law depends
on the structure of the measurement graph $\mathcal{G}$. In view of the electrical analogy
established in the previous chapter, specifically Theorem 4.5.1, the question can
be equivalently posed in terms of the generalized effective resistance. In the sequel,
we only deal with the effective resistance. Now we define a classification of graphs
for which the scaling laws for the effective resistance, and thereby of the optimal
estimation error, can be determined.
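
As a concrete, finite illustration of the quantity whose scaling is being studied, the following sketch computes BLUE error variances for scalar variables ($k = 1$). It assumes the standard weighted least-squares form of the BLUE, in which the error covariance is the inverse of the graph Laplacian weighted by inverse measurement variances, with the row and column of the reference node removed. The helper names `solve` and `blue_variances` are hypothetical and do not come from the original text. With unit variances the result coincides with the effective resistance to the reference node, in line with the electrical analogy.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def blue_variances(num_nodes, edges, ref=0):
    """edges: list of (u, v, variance). Returns {u: BLUE error variance of x_u}."""
    others = [u for u in range(num_nodes) if u != ref]
    idx = {u: i for i, u in enumerate(others)}
    n = len(others)
    lap = [[0.0] * n for _ in range(n)]  # weighted Laplacian without the reference
    for u, v, var in edges:
        w = 1.0 / var
        for a, b in ((u, v), (v, u)):    # edge direction does not affect the covariance
            if a != ref:
                lap[idx[a]][idx[a]] += w
                if b != ref:
                    lap[idx[a]][idx[b]] -= w
    out = {}
    for u in others:
        e = [0.0] * n
        e[idx[u]] = 1.0
        out[u] = solve(lap, e)[idx[u]]   # diagonal entry of the inverse
    return out


# Path 0-1-2-3-4 with unit-variance measurements and reference node 0.
path_var = blue_variances(5, [(i, i + 1, 1.0) for i in range(4)])
# 4-cycle: two routes to the reference act like resistors in parallel.
cycle_var = blue_variances(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)])
```

On the path the variance of node $u$ equals its distance from the reference, while on the cycle the second route to the reference lowers the variance, which is the kind of structural effect the scaling laws below quantify.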

Remark 5.3.1 (Assumptions). Recall from Chapter 4 that a measurement network
is assumed to satisfy the conditions of Assumption 4.2.1, which stipulates that the
measurement graph is weakly connected, it has a finite maximum node degree,
and the edge covariances are uniformly bounded. In this chapter, we impose
the additional assumption that there are no parallel edges in the measurement
graph. This assumption is not restrictive, since parallel measurement edges can be
combined into a single edge with an appropriate covariance without changing the
BLUE covariances. This follows from Proposition 4.6.1 and the analogy between
BLUE covariance and effective resistance. □

5.3.1  Graph  denseness  and  sparseness 

We  start  with  graph  drawing,  which  will  allow  us  to  define  dense  and  sparse 
graphs. 

5.3.1.1 Graph drawing

The drawing of a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ in a d-dimensional Euclidean
space is obtained by mapping the nodes into points in $\mathbb{R}^d$ by a drawing function
$f : \mathcal{V} \to \mathbb{R}^d$. A drawing is also called a representation of a graph [60].
For a particular drawing $f$ of a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, given two nodes
$u, v \in \mathcal{V}$, the Euclidean distance between $u$ and $v$ induced by the drawing
$f : \mathcal{V} \to \mathbb{R}^d$ is defined by

$d_f(u,v) := \| f(v) - f(u) \|,$

where $\| \cdot \|$ denotes the usual Euclidean norm in $\mathbb{R}^d$. It is important to
emphasize that the definition of drawing does not require edges to not intersect, and
therefore every graph has a drawing in every Euclidean space. In fact, every graph has an
infinite number of drawings in every Euclidean space. However, some drawings
are more useful than others in clarifying the relationship between the graph and
the Euclidean space in which it is drawn. It is this relationship that is the key to
defining an appropriate measure of graph denseness and sparseness.


For a particular drawing $f$ and induced Euclidean distance $d_f$ of a graph
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$, four parameters can be used to characterize graph
denseness and sparseness. The minimum node distance, denoted by $s$, is defined as the
minimum Euclidean distance between the drawing of two nodes

$s := \inf_{u,v \in \mathcal{V},\, v \neq u} d_f(u,v).$

The maximum connected range, denoted by $r$, is defined as the Euclidean length
of the drawing of the longest edge

$r := \sup_{(u,v) \in \mathcal{E}} d_f(u,v).$

The maximum uncovered diameter, denoted by $\gamma$, is defined as the diameter of
the largest open ball that can be placed in $\mathbb{R}^d$ such that it does not enclose the
drawing of a node

$\gamma := \sup \{ \delta : \exists B_\delta \text{ s.t. } f(u) \notin B_\delta, \ \forall u \in \mathcal{V} \},$

where the existential quantification spans over the balls $B_\delta$ in $\mathbb{R}^d$ with
diameter $\delta$ and centered at arbitrary points. Finally, the asymptotic distance ratio,
denoted by $\rho$, is defined as

$\rho := \lim_{n \to \infty} \inf \left\{ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} : u, v \in \mathcal{V} \text{ and } d_{\mathcal{G}}(u,v) \geq n \right\},$

where $d_{\mathcal{G}}(u,v)$ denotes the graphical distance between $u$ and $v$ in the graph
$\mathcal{G}$. Essentially $\rho$ provides a lower bound for the ratio between the Euclidean and
the graphical distance for nodes that are far apart. The asymptotic distance ratio
can be thought of as an inverse of the stretch for geometric graphs, which is a
well-studied concept for finite graphs [114].
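
For a finite drawing, three of these parameters can be evaluated directly. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the graph is finite, and the asymptotic distance ratio (a limit over far-apart pairs in an infinite graph) is replaced by the minimum ratio over all pairs, which is also how the ratio is illustrated in Figure 5.1. The maximum uncovered diameter is not computed, since it requires a search over empty balls.

```python
from collections import deque
from itertools import combinations
from math import dist, inf


def hop_distances(adj, src):
    """Breadth-first search: graphical distance from src to every reachable node."""
    hops = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def drawing_parameters(pos, edges):
    """pos maps node -> coordinates (the drawing f); edges is a list of node pairs."""
    adj = {u: set() for u in pos}
    for u, v in edges:  # edge directions are ignored
        adj[u].add(v)
        adj[v].add(u)
    s = min(dist(pos[u], pos[v]) for u, v in combinations(pos, 2))  # min node distance
    r = max(dist(pos[u], pos[v]) for u, v in edges)                 # max connected range
    min_ratio = inf  # finite stand-in for the asymptotic distance ratio
    for u in pos:
        for v, h in hop_distances(adj, u).items():
            if v != u:
                min_ratio = min(min_ratio, dist(pos[u], pos[v]) / h)
    return s, r, min_ratio


# 3 x 3 patch of the square lattice, drawn at integer coordinates.
pos = {(i, j): (float(i), float(j)) for i in range(3) for j in range(3)}
edges = [(u, v) for u, v in combinations(pos, 2) if dist(pos[u], pos[v]) == 1.0]
s, r, min_ratio = drawing_parameters(pos, edges)
```

For this patch $s = r = 1$, and the smallest ratio of Euclidean to graphical distance is attained by diagonal pairs, which are $\sqrt{2}$ apart in the drawing but two hops apart in the graph.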

If the asymptotic distance ratio $\rho$ is positive for the drawing of a graph, or
its maximum connected range $r$ is finite, it says something about the relation
between the graphical distance between nodes and the Euclidean distance between their
drawings, which is stated in the next result. The proof is provided in Section 5.7.

Lemma 5.3.1 ($\rho$ vs. linear growth). The following two statements are equivalent:

1. The asymptotic distance ratio $\rho$ is strictly positive.

2. There exist constants $\alpha > 0$, $\beta > 0$ for which

$d_{\mathcal{G}}(u,v) \leq \alpha\, d_f(u,v) + \beta, \quad \forall u, v \in \mathcal{V}.$

Similarly, the following statements are equivalent:

1. The maximum connected range $r$ is finite.

2. There exist real constants $\alpha > 0$, $\beta > 0$ for which

$d_f(u,v) \leq \alpha\, d_{\mathcal{G}}(u,v) + \beta, \quad \forall u, v \in \mathcal{V}.$ □

In other words, when $\rho > 0$, small Euclidean distance in the drawing implies
small graphical distance in the graph. On the other hand, when $r < \infty$, small
graphical distance implies small Euclidean distance.

Figure 5.1. A drawing of a graph in 2D Euclidean space, and the corresponding
denseness and sparseness parameters. Since the minimal distance between any
two nodes is 1, the minimum node distance is $s = 1$. Since the longest edge
is between $u^*$ and $v^*$, of length $\sqrt{10}$, the maximum connected range is
$r = \sqrt{10}$. The diameter of the largest ball that can fit inside the drawing without
enclosing any node is 2; the maximum uncovered diameter is thus $\gamma = 2$. The minimal
ratio between the Euclidean and graphical distance of a pair of nodes is achieved by the
pair $p^*, q^*$; hence the asymptotic distance ratio is
$\rho = d_f(p^*, q^*) / d_{\mathcal{G}}(p^*, q^*) = 1/5$.

5.3.1.2 Dense and Sparse Graphs

The drawing of a graph for which the maximum uncovered diameter is finite
($\gamma < \infty$) and the asymptotic distance ratio is positive ($\rho > 0$) is called a
dense drawing. We say that a graph $\mathcal{G}$ is dense in $\mathbb{R}^d$ if there exists a dense
drawing of the graph in $\mathbb{R}^d$. Intuitively, these drawings are dense in the sense that
the nodes can cover $\mathbb{R}^d$ without leaving large holes between them while still having
sufficiently many edges so that a small Euclidean distance between two nodes in the
drawing guarantees a small graphical distance between them.

A graph drawing for which the minimum node distance is positive ($s > 0$) and
the maximum connected range is finite ($r < \infty$) is called a civilized drawing [2].
A graph $\mathcal{G}$ is said to be sparse in $\mathbb{R}^d$ if there exists a civilized drawing of it
in $\mathbb{R}^d$. Intuitively, these drawings are sparse in the sense that one can keep the edges
with finite lengths without cramping all nodes on top of each other.

Remark 5.3.2 (historical note). A graph that is sparse in $\mathbb{R}^d$ is a graph that can
be drawn in a civilized manner in $\mathbb{R}^d$, where the notion of “a graph that can be
drawn in a civilized manner” was introduced by Doyle and Snell [2] in connection
with random walks. In this dissertation we refer to such graphs as sparse graphs
since they are the antithesis of dense graphs. □

A graph can be both dense and sparse in the same dimension. For example,
consider the d-dimensional square lattice $\mathbf{Z}_d$, which is defined as a graph with a
node at every point in $\mathbb{R}^d$ with integer coordinates and an edge between every
pair of nodes that have a Euclidean distance 1 between them (see Figure 5.3
for examples). We can conclude from the definition of a lattice (which defines a
drawing as well) that the d-dimensional lattice is both sparse and dense in $\mathbb{R}^d$.
However, there is no civilized drawing of the d-dimensional lattice in
$\mathbb{R}^{\bar{d}}$ for any $\bar{d} < d$. Moreover, there is no dense drawing of the
d-dimensional lattice in $\mathbb{R}^{\bar{d}}$ for any $\bar{d} > d$. This means, for example,
that the 3D lattice is not sparse in 2D and is not dense in 4D. In general, a graph being
dense in a particular dimension puts a restriction on which dimensions it can be sparse in.
The next result, proved in Section 5.4.2, states this precisely.

Lemma 5.3.2. If a graph is dense in $\mathbb{R}^d$ for some $d \geq 1$, it is not sparse in
$\mathbb{R}^{\bar{d}}$ for every $\bar{d} < d$. □

Euclidean      Covariance matrix $\Sigma_{u,o}$ of the          Covariance matrix $\Sigma_{u,o}$ of the
space          estimation error of $x_u$ in a sparse            estimation error of $x_u$ in a dense
               graph with a civilized drawing $f_s$             graph with a dense drawing $f_d$
-----------------------------------------------------------------------------------------------------
$\mathbb{R}$     $\Sigma_{u,o}(\mathcal{G}) = \Omega(d_{f_s}(u,o))$        $\Sigma_{u,o}(\mathcal{G}) = O(d_{f_d}(u,o))$
$\mathbb{R}^2$   $\Sigma_{u,o}(\mathcal{G}) = \Omega(\log d_{f_s}(u,o))$   $\Sigma_{u,o}(\mathcal{G}) = O(\log d_{f_d}(u,o))$
$\mathbb{R}^3$   $\Sigma_{u,o}(\mathcal{G}) = \Omega(1)$                   $\Sigma_{u,o}(\mathcal{G}) = O(1)$

Table 5.1. Covariance matrix $\Sigma_{u,o}$ of $x_u$’s optimal estimate for graphs that are
dense or sparse in $\mathbb{R}^d$. In the table, $d_{f_d}(u,o)$ denotes the Euclidean distance
between node $u$ and the reference node $o$ for any drawing
$f_d : \mathcal{V} \to \mathbb{R}^d$ that establishes the graph’s denseness in the Euclidean space
$\mathbb{R}^d$, and $d_{f_s}(u,o)$ denotes the Euclidean distance in any drawing $f_s$ that
establishes the graph’s sparseness in $\mathbb{R}^d$.

5.3.2  Error  scaling  laws 

The concepts of dense and sparse graphs allow one to characterize precisely
how the BLUE error covariance $\Sigma_{u,o}$ grows with the distance of the node $u$ from
the reference $o$ in infinite measurement graphs. The following theorem, proved in
Section 5.5.3, establishes the scaling laws for the BLUE error covariances in dense
and sparse graphs. The theorem is an answer to the error scaling question raised
in Chapter 1.

Theorem 5.3.1 (Error Scaling Laws). Consider a measurement network $(\mathcal{G}, P)$
that satisfies Assumption 4.2.1, such that the graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ has a
reference node $o \in \mathcal{V}$. Then, the limiting BLUE error covariance $\Sigma_{u,o}$ for
every node $u \in \mathcal{V} \setminus \{o\}$ obeys the scaling laws shown in Table 5.1. □

In Table 5.1, the usual asymptotic notations $\Omega(\cdot)$ and $O(\cdot)$ are used with
matrix-valued functions in the following way. For a matrix-valued function
$g : \mathbb{R} \to \mathbb{R}^{k \times k}$ and a scalar-valued function
$p : \mathbb{R} \to \mathbb{R}$, the notation $g(x) = O(p(x))$ means that
there exists a positive constant $x_o$ and a constant matrix $A \in \mathbb{S}^{k+}$ such that
$g(x) \leq A\, p(x)$ for all $x > x_o$. Similarly, $g(x) = \Omega(p(x))$ means there exists a
positive constant $x_o$ and a constant matrix $B \in \mathbb{S}^{k+}$ such that
$g(x) \geq B\, p(x)$ for all $x > x_o$. Recall that $\mathbb{S}^{k+}$ is the set of all
$k \times k$ symmetric positive definite matrices.

If a graph is both sparse and dense in a particular Euclidean space $\mathbb{R}^d$, the
asymptotic upper and lower bounds for the error covariance are the same. The
effective resistance in such a graph grows with distance at the same rate as it
grows in the d-dimensional lattice. Intuitively, such a graph behaves approximately like a
lattice.

Since a graph can be dense and sparse in multiple dimensions, one may wonder
if it is possible to encounter the situation in which a graph is dense in $\mathbb{R}^2$ as
well as sparse in $\mathbb{R}$, which would lead to a logarithmic upper bound in one drawing
and a linear lower bound in another drawing. Such an undesirable situation, however,
is precluded by Lemma 5.3.2.

5.3.3  Counterexamples  to  conventional  wisdom 

It was pointed out in Section 5.1 that typically, the average node degree of a
graph or the number of nodes and edges per unit area of a deployed network is used
as a measure of graph denseness. However, these measures do not predict error
scaling laws. The three graphs in Figure 5.2 offer an example of the inadequacy
of node degree as a measure of denseness. The figure shows a 3-fuzz of the 1D lattice (see
Section 5.4 for the definition of lattices and fuzzes), a triangular lattice, and
a 3-dimensional lattice. It can be verified from the definitions in Section 5.3.1.2
that the 3-fuzz of the 1D lattice is both dense and sparse in $\mathbb{R}$, the triangular
lattice is dense and sparse in $\mathbb{R}^2$, and the 3D lattice is dense and sparse in
$\mathbb{R}^3$. Thus, it follows from Theorem 5.3.1 that the BLUE estimation error scales
linearly with distance in the 3-fuzz of the 1D lattice, logarithmically with distance
in the triangular lattice, and is uniformly bounded with respect to distance in the
3D lattice, even though every node in each of these graphs has the same degree,
namely six.


5.4  Dense  and  sparse  graphs 

The dense and sparse graphs defined in Section 5.3.1.2 have a special relationship
with lattices and a close relative of lattices - called lattice fuzzes - in terms
of embedding.

Recall that the d-dimensional square lattice $\mathbf{Z}_d$ is defined as a graph with a
node at every point in $\mathbb{R}^d$ with integer coordinates and an edge between every
pair of nodes that have a Euclidean distance 1 between them (see Figure 5.3 for
examples). The h-fuzz of a graph $\mathcal{G}$, introduced by Doyle and Snell [2], is a graph
with the same set of nodes as $\mathcal{G}$ but with a larger set of edges. Given a graph
$\mathcal{G}$ and a positive integer $h$, the h-fuzz of $\mathcal{G}$, denoted by
$\mathcal{G}^{(h)}$, is the graph that has an edge between two nodes $u$ and $v$ whenever the
graphical distance between them in $\mathcal{G}$ is less than or equal to $h$. The graphical
distance $d_{\mathcal{G}}$ is evaluated without regard to edge directions. In view of
Remark 4.6.1, the edge directions are irrelevant so long as we are interested only in the
effective resistance.

[Figure 5.2 panels: (a) a 3-fuzz of a 1D lattice; (b) a triangular lattice; (c) a 3D lattice.]

Figure 5.2. Three measurement graphs that show vastly different scaling laws
of the estimation error, even though every node in each graph has the same degree.
Furthermore, they are all “sparse” according to traditional graph-theoretic
terminology (see the discussion on graph denseness in Section 5.1).

[Figure 5.3 panels: (a) 1D lattice $\mathbf{Z}_1$; (b) 2D lattice $\mathbf{Z}_2$; (c) 3D lattice $\mathbf{Z}_3$.]

Figure 5.3. Lattices.

Recalling the definition of embedding from Section 4.6.1, and noting that there
are no parallel edges by assumption (see Remark 5.3.1), we see that $\mathcal{G}$ can be
embedded in $\bar{\mathcal{G}}$ if it can be made a subgraph of $\bar{\mathcal{G}}$ by relabeling
its nodes and disregarding edge directions. Recall also that we write
$\mathcal{G} \subset \bar{\mathcal{G}}$ to denote that $\mathcal{G}$ can be
embedded in $\bar{\mathcal{G}}$.
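
The h-fuzz defined above is straightforward to construct for a finite graph with a breadth-first search truncated at depth $h$. The following is a minimal sketch; the function name `fuzz` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque


def fuzz(adj, h):
    """Return the h-fuzz of an undirected graph given as {node: set_of_neighbors}."""
    fuzzed = {}
    for src in adj:
        hops = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if hops[u] == h:  # do not explore beyond graphical distance h
                continue
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        fuzzed[src] = set(hops) - {src}  # every node within distance h becomes a neighbor
    return fuzzed


# Finite piece of the 1D lattice: a path on 20 nodes, and its 3-fuzz.
n = 20
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
path3 = fuzz(path, 3)
interior_degree = len(path3[10])
```

Away from the two ends of the path, every node of the 3-fuzz has degree six, the same degree as in the triangular lattice and the 3D lattice of Figure 5.2.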


5.4.1  Relationship  with  lattices 

The next theorem shows that fuzzes of dense graphs can embed lattices. The
proof of the result is provided in Section 5.7. We use $d_{\mathbf{Z}_d}(\cdot)$ to denote the
graphical distance in $\mathbf{Z}_d$ and $d_f(\cdot)$ to denote the Euclidean distance in
$\mathbb{R}^d$ induced by a drawing $f$.

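Theorem 5.4.1 (Dense Embedding). A graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is dense in
$\mathbb{R}^d$ if and only if there exist finite, positive integers $h$ and $c$ such that the
following conditions are satisfied:

1. $\mathcal{G}^{(h)} \supset \mathbf{Z}_d$, and,

2. if $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$ is an embedding of $\mathbf{Z}_d$ into
$\mathcal{G}^{(h)}$, then, $\forall u \in \mathcal{V}$,
$\exists \bar{u} \in \eta(\mathcal{V}_{\mathbf{Z}_d}) \subseteq \mathcal{V}$ such that
$d_{\mathcal{G}}(u, \bar{u}) \leq c$.

Moreover, if $f : \mathcal{V} \to \mathbb{R}^d$ is a dense drawing of $\mathcal{G}$ in
$\mathbb{R}^d$ and $\eta$ is an embedding function that satisfies condition (2) above, then the
following is also true: $\forall u, v \in \mathcal{V}$, we can find
$u_z, v_z \in \mathcal{V}_{\mathbf{Z}_d}$ which satisfy

$d_{\mathcal{G}}(u, \eta(u_z)) \leq c, \quad d_{\mathcal{G}}(v, \eta(v_z)) \leq c,$

$d_{\mathbf{Z}_d}(u_z, v_z) \leq 4d + \frac{\sqrt{d}}{\gamma}\, d_f(u,v), \qquad (5.1)$

where $\gamma$ is the maximum uncovered diameter of the f-drawing of $\mathcal{G}$. □
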


In other words, $\mathcal{G}$ is dense in $\mathbb{R}^d$ if and only if (i) the d-dimensional
lattice can be embedded in an h-fuzz of $\mathcal{G}$ for some positive integer $h$ and
(ii) every node of $\mathcal{G}$ that is not the image of a node in $\mathbf{Z}_d$ is at a
uniformly bounded graphical distance from a node that is the image of a node in
$\mathbf{Z}_d$. The significance of (5.1) is that not only can we find for every node in
$\mathcal{G}$ a close-by node that has a pre-image in the lattice, but also these close-by
nodes can be so chosen that if the Euclidean distance between a pair of nodes $u$ and $v$
in a dense drawing of the graph is small, then the graphical distance in the lattice
between the pre-images of their close-by nodes is small as well.

The next theorem shows that a graph that is sparse in $\mathbb{R}^d$ can be embedded
in a fuzz of the d-dimensional lattice. The proof of the theorem is provided in
Section 5.7.


Theorem 5.4.2 (Sparse Embedding). A graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is sparse in
$\mathbb{R}^d$ if and only if there exists a positive integer $h$ such that
$\mathcal{G} \subset \mathbf{Z}_d^{(h)}$. Moreover, if $f : \mathcal{V} \to \mathbb{R}^d$ is a
civilized drawing of $\mathcal{G}$ in $\mathbb{R}^d$, then there exists an embedding
$\eta : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ so that $\forall u, v \in \mathcal{V}$,

$d_{\mathbf{Z}_d}(\eta(u), \eta(v)) \geq \sqrt{d} \left( \frac{d_f(u,v)}{s} - 2 \right), \qquad (5.2)$

where $s$ is the minimum node distance of the f-drawing of $\mathcal{G}$. □

In other words, $\mathcal{G}$ is sparse in $\mathbb{R}^d$ if and only if $\mathcal{G}$ can be
embedded in an h-fuzz of a d-dimensional lattice. The significance of (5.2) is that if the
Euclidean distance between a pair of nodes in a civilized drawing of the graph is large, the
graphical distance in the lattice between their corresponding nodes is also large.

The first statement of the theorem is essentially taken from [2], where it was
proved that if a graph can be drawn in a civilized manner in $\mathbb{R}^d$, then it can be
embedded in an h-fuzz of a d-dimensional lattice, where $h$ depends only on $s$ and $r$. A
careful examination of the proof reveals that it is not only a sufficient but also a
necessary condition for embedding in lattice fuzzes.
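
One standard way to see why a civilized drawing yields an embedding into a lattice fuzz is to overlay a grid of half-open cubes whose diagonal equals $s$, so that no cube holds two nodes, and to send each node to the integer index of its cube. The sketch below illustrates this idea for a finite drawing. It is an assumption-laden illustration in the spirit of [2], not a reproduction of the proof in Section 5.7, and the function name and the particular bound `h` are hypothetical.

```python
from itertools import combinations
from math import dist, floor, sqrt


def lattice_embedding(pos, edges):
    """Map each node of a civilized drawing to a distinct point of the integer lattice."""
    d = len(next(iter(pos.values())))
    s = min(dist(p, q) for p, q in combinations(pos.values(), 2))  # minimum node distance
    r = max(dist(pos[u], pos[v]) for u, v in edges)                # maximum connected range
    side = s / sqrt(d)  # two points in one half-open cube of this side are closer than s
    eta = {u: tuple(floor(x / side) for x in p) for u, p in pos.items()}
    # Each coordinate index differs by at most floor(r / side) + 1 across an edge.
    h = d * (floor(r / side) + 1)
    return eta, h


pos = {0: (0.0, 0.0), 1: (1.3, 0.2), 2: (2.1, 1.9), 3: (0.4, 1.5), 4: (3.0, 0.5)}
edges = [(0, 1), (1, 2), (0, 3), (1, 4), (2, 3)]
eta, h = lattice_embedding(pos, edges)
lattice_hops = {(u, v): sum(abs(a - b) for a, b in zip(eta[u], eta[v])) for u, v in edges}
```

Distinct nodes land on distinct lattice points, and the images of adjacent nodes are at most `h` lattice hops apart, so the graph is a subgraph of the `h`-fuzz of the lattice after relabeling. The bound depends only on $s$, $r$, and $d$.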

5.4.2  Checking  denseness  and  sparseness 

To show a graph is dense (or sparse) in a particular dimension, one has to find
a drawing in that dimension with the appropriate properties. Dense and sparse
graphs occur readily with realistic “communication range” models, in which nodes
are deployed in a Euclidean space - perhaps randomly - and two nodes form an
edge between them if they are within range of each other [115]. A widely studied
class of such graphs that is also highly relevant for engineering applications is the
random geometric graph [109]. For such graphs, a natural drawing is obtained
by mapping the nodes to their physical locations in the Euclidean space they are
deployed in. We can show using the natural drawing that a graph generated
by placing a countable number of nodes in $\mathbb{R}^d$, so that the maximum uncovered
diameter $\gamma$ of its natural drawing is finite, and every pair of nodes whose Euclidean
distance in the natural drawing is less than $2\gamma$ has an edge between them, is dense
in $\mathbb{R}^d$. Such communication-range models also yield sparse graphs quite easily, since
the condition of finite maximum connected range is satisfied by construction.

On  the  other  hand,  to  show  that  a  graph  is  not  dense  (or  not  sparse)  in  a 
particular  dimension  is  harder  since  one  has  to  show  that  no  drawing  is  possible 
that  has  the  required  properties.  Typically,  this  can  be  done  by  showing  that  the 
existence  of  a  dense  (or  sparse)  drawing  leads  to  a  contradiction.  An  application 
of  this  technique  leads  to  the  following  result. 

Lemma 5.4.1. 1. The d-dimensional lattice $\mathbf{Z}_d$ is not sparse in
$\mathbb{R}^{\bar{d}}$ for every $\bar{d} < d$, and it is not dense in $\mathbb{R}^{\bar{d}}$ for
every $\bar{d} > d$.

2. A regular¹ infinite tree is not dense or sparse in any dimension. □

¹ A graph is called regular if the degree of every node in the graph is the same.

The proof of the first statement of the lemma is provided in Section 5.7. The proof of the
second statement is not provided since the method of proof is similar.

We are now ready to prove Lemma 5.3.2.

Proof of Lemma 5.3.2. To prove the result by contradiction, suppose that a graph
$\mathcal{G}$ is dense in $\mathbb{R}^d$ as well as sparse in $\mathbb{R}^{\bar{d}}$, where
$\bar{d} < d$. It follows from Theorems 5.4.1 and 5.4.2 that there exist positive integers
$\ell, p$ such that $\mathbf{Z}_d \subset \mathcal{G}^{(\ell)}$ and
$\mathcal{G} \subset \mathbf{Z}_{\bar{d}}^{(p)}$. It is straightforward to verify the
following facts:

1. for every pair of graphs $\mathcal{G}, \bar{\mathcal{G}}$ that do not have any parallel
edges, $\mathcal{G} \subset \bar{\mathcal{G}} \Rightarrow \mathcal{G}^{(\ell)} \subset \bar{\mathcal{G}}^{(\ell)}$
for every positive integer $\ell$.

2. for an arbitrary graph $\mathcal{G}$ without parallel edges, and two positive integers
$\ell, p$, we have $(\mathcal{G}^{(\ell)})^{(p)} = \mathcal{G}^{(\ell p)}$.

It follows that $\mathbf{Z}_d \subset \mathbf{Z}_{\bar{d}}^{(\ell p)}$, which means, from the
sparse embedding Theorem 5.4.2, that the d-dimensional lattice is sparse in
$\mathbb{R}^{\bar{d}}$. This contradicts Lemma 5.4.1, which completes the proof. ■
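
The second fact used in the proof, that taking a fuzz of a fuzz multiplies the fuzz parameters, can be spot-checked numerically on a small graph. The sketch below does this for a 4 x 4 patch of the 2D lattice. The helper `fuzz` is a hypothetical breadth-first implementation of the h-fuzz definition from Section 5.4, and a check on one finite graph is of course not a proof.

```python
from collections import deque


def fuzz(adj, h):
    """h-fuzz of an undirected graph given as {node: set_of_neighbors}."""
    fuzzed = {}
    for src in adj:
        hops = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if hops[u] == h:  # stop at graphical distance h
                continue
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        fuzzed[src] = set(hops) - {src}
    return fuzzed


# 4 x 4 patch of the 2D square lattice.
steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
grid = {(i, j): {(i + a, j + b) for a, b in steps
                 if 0 <= i + a < 4 and 0 <= j + b < 4}
        for i in range(4) for j in range(4)}

two_then_two = fuzz(fuzz(grid, 2), 2)  # 2-fuzz of the 2-fuzz
four_at_once = fuzz(grid, 4)           # 4-fuzz of the original graph
```

The two constructions produce the same edge set, because a pair of nodes is within two hops in the 2-fuzz exactly when it is within four hops in the original graph.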


5.5  Establishing  the  error  scaling  laws 

Here we briefly outline the approach by which the error scaling laws stated in
Theorem 5.3.1 are obtained, and how the definitions of dense and sparse graphs
allow one to obtain those results. The key idea is to embed the measurement
graph in a “nice” looking graph such that the effective resistances in the nice
graph can be computed. Application of Rayleigh’s monotonicity law then tells us
that the effective resistance in the nice graph is a lower bound on the effective
resistance in the measurement graph, and from the electrical analogy we get a
lower bound on the BLUE covariance. Similarly, when we can embed a nice graph
in the measurement graph, we get an upper bound on the BLUE covariances.

The nice graphs that we use for the embedding are lattices and their fuzzes.
The effective resistance in lattices and their fuzzes can be analytically computed
because of the symmetry in their structure. It will be shown shortly that the
effective resistance in 1D, 2D and 3D lattices is a linear, logarithmic, and bounded
function of distance, respectively. Since dense graphs can embed lattice-like graphs,
namely fuzzes of lattices, we can show that the effective resistance in graphs that are
dense in 1D, 2D, and 3D grows at most as a linear, logarithmic, and bounded function of
distance, respectively. A similar story, with lower bounds in place of upper bounds,
unfolds for sparse graphs.

5.5.1 Effective resistance for lattices and fuzzes

An  h-fuzz  will  clearly  have  lower  effective  resistance  than  the  original  graph 
because  of  Rayleigh’s  Monotonicity  Law,  but  it  is  lower  only  by  a  constant  factor. 
The  following  result  states  this  feature  of  fuzzes,  which  is  a  straightforward  exten¬ 
sion  to  the  generalized  case  of  a  result  about  scalar  effective  resistance  established 
by  Doyle  [116].  Since  the  proof  in  [116]  uses  terminology  of  random  walks,  we  still 
include  a  proof  in  Section  5.5.3. 

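Lemma 5.5.1. Let $(\mathcal{G}, R_o)$ be a generalized electrical network with graph
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$ satisfying Assumption 4.2.1, with a constant generalized
resistance $R_o \in \mathbb{S}^{k+}$ on its every edge. Let $(\mathcal{G}^{(h)}, R_o)$ be the
electrical network similarly constructed on $\mathcal{G}^{(h)}$, the h-fuzz of $\mathcal{G}$.
For every pair of nodes $u$ and $v$ in $\mathcal{V}$,

$\alpha R^{\mathrm{eff}}_{u,v}(\mathcal{G}) \leq R^{\mathrm{eff}}_{u,v}(\mathcal{G}^{(h)}) \leq R^{\mathrm{eff}}_{u,v}(\mathcal{G}),$

where $R^{\mathrm{eff}}_{u,v}(\cdot)$ is the effective resistance in the network $(\cdot, R_o)$
and $\alpha \in (0, 1]$ is a positive constant that does not depend on $u$ and $v$. □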
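
The statement of the lemma can be illustrated numerically in the scalar case by comparing a long path of 1-ohm resistors with its 3-fuzz. The sketch below is a finite, scalar stand-in for the infinite generalized networks of the lemma. The helper names are hypothetical, and the dense linear solver is written out only to keep the example self-contained.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            if m[i][c] != 0.0:  # skip rows that are already eliminated
                f = m[i][c] / m[c][c]
                for j in range(c, n + 1):
                    m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def effective_resistance(adj, u, v):
    """Unit resistors on every edge: ground v, inject 1 A at u, read the potential at u."""
    nodes = [w for w in adj if w != v]
    idx = {w: i for i, w in enumerate(nodes)}
    lap = [[0.0] * len(nodes) for _ in nodes]
    for w in nodes:
        lap[idx[w]][idx[w]] = float(len(adj[w]))
        for x in adj[w]:
            if x != v:
                lap[idx[w]][idx[x]] -= 1.0
    rhs = [0.0] * len(nodes)
    rhs[idx[u]] = 1.0
    return solve(lap, rhs)[idx[u]]


def fuzz_of_path(n, h):
    """h-fuzz of the path on n nodes: i and j are adjacent when 0 < |i - j| <= h."""
    return {i: {j for j in range(i - h, i + h + 1) if 0 <= j < n and j != i}
            for i in range(n)}


n = 100
path, path3 = fuzz_of_path(n, 1), fuzz_of_path(n, 3)
separations = (20, 40, 60)
r_path = [effective_resistance(path, 20, 20 + k) for k in separations]
r_fuzz = [effective_resistance(path3, 20, 20 + k) for k in separations]
```

The resistances in the 3-fuzz never exceed those in the path, as Rayleigh's monotonicity law requires, yet they still grow linearly with the separation: far from the two terminals the potential is linear in position, and a cut argument gives a slope of $1/(1^2 + 2^2 + 3^2) = 1/14$ ohm per hop.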

The following lemma establishes effective resistances in d-dimensional lattices
and their fuzzes. Note that infinite generalized networks constructed by assigning
constant matrix-resistances on every edge of a lattice or an h-fuzz of it satisfy
Assumption 4.2.1, and therefore the results in Chapter 4 guarantee that the effective
resistances in infinite lattice networks are well defined.

Lemma 5.5.2. Consider the electrical network $(\mathbf{Z}_d^{(h)}, R_o)$ with a constant
generalized resistance $R_o \in \mathbb{S}^{k+}$ at every edge of the h-fuzz of the
d-dimensional square lattice $\mathbf{Z}_d$, where $h$ is a positive integer. The generalized
effective resistance $R^{\mathrm{eff}}_{u,v}$ between two nodes $u$ and $v$ in the electrical
network $(\mathbf{Z}_d^{(h)}, R_o)$ satisfies

1. $R^{\mathrm{eff}}_{u,v}(\mathbf{Z}_1^{(h)}) = \Theta(d_{\mathbf{Z}_1}(u,v))$,

2. $R^{\mathrm{eff}}_{u,v}(\mathbf{Z}_2^{(h)}) = \Theta(\log d_{\mathbf{Z}_2}(u,v))$,

3. $R^{\mathrm{eff}}_{u,v}(\mathbf{Z}_3^{(h)}) = \Theta(1)$. □

Proof of Lemma 5.5.2. Consider the scalar electrical network $(\mathbf{Z}_d, 1)$ formed by
assigning a 1-Ohm resistance to every edge of the d-dimensional lattice $\mathbf{Z}_d$. The
effective resistance between two nodes in the one-dimensional lattice network
$(\mathbf{Z}_1, 1)$ is given by $r^{\mathrm{eff}}_{u,v} = d_{\mathbf{Z}_1}(u,v)$, which follows
from the series resistance formula. In the 2-dimensional lattice network
$(\mathbf{Z}_2, 1)$, the effective resistance obeys
$r^{\mathrm{eff}}_{u,v} = \Theta(\log d_{\mathbf{Z}_2}(u,v))$ [105]. Similarly, it was shown
in [105] that for the scalar electrical network $(\mathbf{Z}_3, 1)$, the effective resistance
obeys $r^{\mathrm{eff}}_{u,v} = \Theta(1)$. The results now follow upon applying Lemma 4.6.1,
which allows one to go from scalar effective resistances to the matrix-valued case, and
Lemma 5.5.1, which shows that the effective resistances in a graph and in its h-fuzz have
the same order. ■

5.5.2  An  intuitive  explanation 

Before proving the scaling laws for dense and sparse graphs, we offer an
intuitive explanation, which comes from thinking of them as “coarse approximations”
of the respective Euclidean spaces. In fact, thinking of the graph as a
metric space, with the graphical distance being the associated metric, such
approximations can be made rigorous if mappings between the node set and points
in the Euclidean space are defined that preserve distances up to some constant
factor (see [117] for a thorough exposition of this topic). A dense drawing of a
graph in $\mathbb{R}^d$ is essentially such a map, which ensures that the distortion (measured
by the metric in the respective spaces, Euclidean or graphical) is upper-bounded
in going from the Euclidean metric space to the graphical metric space. This
follows from Lemma 5.3.1.

[Figure 5.4: scribbled calculation reproduced from Doyle and Snell [2] (their Figure 41),
concluding that the resistance to infinity is infinite for d = 1, 2 and finite for d ≥ 3.]

Figure 5.4. Doyle and Snell [2]’s illustration on approximating lattices by
Euclidean spaces.

In  fact,  the  scaling  laws  can  be  explained  by  such  coarse  approximations,  by 
examining  the  effective  resistance  in  resistive  medium  filling  the  entire  Euclidean 
space.  The  following  quote  from  Doyle  and  Snell  [2]  explains  it  all: 

Suppose  we  replace  our  d-dimensional  resistor  lattice  by  a  (homoge¬ 
neous,  isotropic)  resistive  medium  filling  all  of  and  ask  for  the 
effective  resistance  to  infinity.  Naturally  we  expect  that  the  rotational 
symmetry  will  make  this  continuous  problem  easier  to  solve  than  the 
original  discrete  problem.  If  we  took  this  problem  to  a  physicist,  he 
or  she  would  probably  produce  something  like  the  scribblings  illustrated 
in  Figure  5. If2 ,  and  conclude  that  the  effective  resistance  is  infinite  for 
d  =  1;  2  and  finite  for  d  >  2. 


² The figure number has been changed here for obvious reasons.

Although Doyle and Snell [2] were concerned chiefly with whether resistances grow to infinity or stay bounded, we can conclude much more from continuum approximations, once we recognize that the matrix-valued effective resistance behaves quite similarly to the scalar-valued one. By elementary calculations, one can conclude that the scalar effective resistance in a metallic rod grows linearly with the length of the rod. This is shown in Figure 5.4. In an annular plate with inner radius $r_a$ and outer radius $r$, the effective resistance between the inner and outer boundaries of the plate is a logarithmic function of the radius $r$, when $r$ is large. In a sphere, however, similar calculations show that the effective resistance stays bounded by a constant even as the size of the sphere is increased without bound. It is not difficult to convince ourselves that the $d$-dimensional lattice is a good approximation of the $d$-dimensional Euclidean space; hence it is no surprise that the effective resistances grow in the lattice $\mathbf{Z}_d$ at the same rate as they do in $\mathbb{R}^d$. The results on effective resistances in lattices have been established rigorously in [105, 118].
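The elementary calculations referred to above can be sketched as follows. This is only an illustration under assumptions that are ours and not the source's: a homogeneous medium of resistivity $\varrho$, purely radial current flow, a rod of cross-sectional area $A$, a plate of thickness $t$, and inner radius $r_a$. Adding up the resistances of thin shells in series gives linear, logarithmic, and bounded growth in one, two, and three dimensions respectively.

```latex
\begin{align*}
R_{1\mathrm{D}}(L) &= \int_{0}^{L} \frac{\varrho}{A}\,dx
                    = \frac{\varrho}{A}\,L, \\
R_{2\mathrm{D}}(r) &= \int_{r_a}^{r} \frac{\varrho}{2\pi t\,x}\,dx
                    = \frac{\varrho}{2\pi t}\,\log\frac{r}{r_a}, \\
R_{3\mathrm{D}}(r) &= \int_{r_a}^{r} \frac{\varrho}{4\pi x^{2}}\,dx
                    = \frac{\varrho}{4\pi}\Bigl(\frac{1}{r_a}-\frac{1}{r}\Bigr)
                    \le \frac{\varrho}{4\pi r_a}.
\end{align*}
```

The symbol $\varrho$ is used for the resistivity to avoid a clash with the asymptotic distance ratio $\rho$.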

In going from lattices to more general measurement graphs, we used denseness and sparseness properties to compare them to lattices by using embedding. However, a better understanding is obtained by comparing dense and sparse graphs to the Euclidean spaces. A graph that is dense in $\mathbb{R}^d$ is essentially an “upper-approximation” of the Euclidean space $\mathbb{R}^d$, in the sense that when the graph is looked at through blurring lenses, it looks at least as dense as $\mathbb{R}^d$. Since the graph has “more conductive material” than $\mathbb{R}^d$, the current faces less resistance and hence the effective resistance in the graph grows no faster than the effective resistance in the Euclidean space. Figure 5.2 also attempts to argue this pictorially. Similarly, a graph that is sparse in $\mathbb{R}^d$ is a “lower-approximation” of $\mathbb{R}^d$ - it is at least as sparse as $\mathbb{R}^d$. Since the graph has “less conductive material” than $\mathbb{R}^d$, the effective resistance in the graph grows at least as fast as in $\mathbb{R}^d$. Such approximation is not uncommon in other fields of study. The field of coarse geometry, for example, assumes such a point of view and avers “...two spaces that look the same from a great distance are actually equivalent” [119].
Table 5.2. An intuitive explanation of error scaling laws by continuum approximation. The notation $f(x) \sim x$ means the ratio $f(x)/x$ goes to a constant as $x$ goes to infinity.

5.5.3  Proof  of  the  Error  Scaling  Laws  Theorem  5.3.1 

We now prove Theorem 5.3.1 by using all the tools that have been developed in this chapter and in the previous one. The following terminology will be needed in the proofs. For a matrix-valued function $g : \mathbb{R} \to \mathbb{R}^{k \times k}$ and a scalar-valued function $p : \mathbb{R} \to \mathbb{R}$, the notation $g(y) = \Theta(p(y))$ means that $g(y) = \Omega(p(y))$ and $g(y) = O(p(y))$. The asymptotic notations $O$ and $\Omega$ are described in Section 5.3.

Proof of Theorem 5.3.1. [Upper bounds:] Throughout the proof of the upper bounds, we will use $R^{\mathrm{eff}}_{u,v}(\mathcal{G})$, for any graph $\mathcal{G}$, to denote the effective resistance between nodes $u$ and $v$ in the electrical network $(\mathcal{G}, P_{\max})$ with every edge of $\mathcal{G}$ having a generalized resistance of $P_{\max}$. Consider the generalized electrical network $(\mathcal{G}, P_{\max})$ formed by assigning a constant generalized resistance of $P_{\max}$ to every edge of the measurement graph $\mathcal{G}$. From the Electrical Analogy theorem and Rayleigh's Monotonicity Law (Theorems 4.5.1 and 4.6.1), we get

\[ \Sigma_{u,o} \leq R^{\mathrm{eff}}_{u,o}(\mathcal{G}). \]

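Since $\mathcal{G}$ is dense in $\mathbb{R}^d$, there exist a dense drawing of $\mathcal{G}$ in $\mathbb{R}^d$, which we denote by a drawing function $f$, and a positive integer $h$ such that the $d$-D lattice $\mathbf{Z}_d$ can be embedded in the $h$-fuzz of $\mathcal{G}$. Moreover, Theorem 5.4.1 tells us that there exist $u_z, o_z \in \mathcal{V}_{\mathbf{Z}_d}$, a positive constant $c$, and an embedding $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$ of $\mathbf{Z}_d$ into $\mathcal{G}^{(h)}$, such that

\[ d_{\mathcal{G}}(u, \eta(u_z)) \leq c, \qquad d_{\mathcal{G}}(o, \eta(o_z)) \leq c, \tag{5.3} \]

\[ d_{\mathbf{Z}_d}(u_z, o_z) \leq 4d + \frac{\sqrt{d}}{\gamma}\, d_f(u, o), \tag{5.4} \]

where $\gamma$ is the maximum uncovered diameter of the $f$-drawing of $\mathcal{G}$. Note that $\eta(u_z), \eta(o_z) \in \mathcal{V}$. Consider the electrical network $(\mathcal{G}^{(h)}, P_{\max})$ formed by assigning every edge of $\mathcal{G}^{(h)}$ a resistance of $P_{\max}$. From the Triangle Inequality for effective resistances (Lemma 4.6.2),

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}^{(h)}) \leq R^{\mathrm{eff}}_{u,\eta(u_z)}(\mathcal{G}^{(h)}) + R^{\mathrm{eff}}_{\eta(u_z),\eta(o_z)}(\mathcal{G}^{(h)}) + R^{\mathrm{eff}}_{\eta(o_z),o}(\mathcal{G}^{(h)}). \tag{5.5} \]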

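For any two nodes $u, v \in \mathcal{V}$, the triangle inequality gives us $R^{\mathrm{eff}}_{u,v}(\mathcal{G}^{(h)}) \leq d_{\mathcal{G}^{(h)}}(u,v)\, P_{\max} \leq d_{\mathcal{G}}(u,v)\, P_{\max}$. Using this bound in (5.5), and by using (5.3), we get

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}^{(h)}) \leq 2c\, P_{\max} + R^{\mathrm{eff}}_{\eta(u_z),\eta(o_z)}(\mathcal{G}^{(h)}). \tag{5.6} \]

Since $\mathcal{G}^{(h)} \supset \mathbf{Z}_d$, from Rayleigh's Monotonicity Law (Theorem 4.6.1), we get

\[ R^{\mathrm{eff}}_{\eta(u_z),\eta(o_z)}(\mathcal{G}^{(h)}) \leq R^{\mathrm{eff}}_{u_z,o_z}(\mathbf{Z}_d). \]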
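When $\mathcal{G}$ is dense in, say, 2D, we have from Lemma 5.5.2 that

\[ R^{\mathrm{eff}}_{u_z,o_z}(\mathbf{Z}_2) = \Theta\bigl(\log d_{\mathbf{Z}_2}(u_z, o_z)\bigr), \]

which implies

\[ R^{\mathrm{eff}}_{\eta(u_z),\eta(o_z)}(\mathcal{G}^{(h)}) = O\bigl(\log d_{\mathbf{Z}_2}(u_z, o_z)\bigr). \]

Combining this with (5.4) and (5.6), we get

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}^{(h)}) = O\bigl(\log d_f(u, o)\bigr). \]

Since $\mathcal{G}$ is a bounded degree graph, from Lemma 5.5.1 we know that the effective resistances in $\mathcal{G}$ and its $h$-fuzz are of the same order:

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}) = \Theta\bigl(R^{\mathrm{eff}}_{u,o}(\mathcal{G}^{(h)})\bigr), \]

which gives us the desired result that, when $\mathcal{G}$ is dense in 2D,

\[ \Sigma_{u,o} \leq R^{\mathrm{eff}}_{u,o}(\mathcal{G}) = O\bigl(\log d_f(u, o)\bigr). \]

The statements of the upper bounds for 1 and 3 dimensions can be proved similarly. This concludes the proof of the upper bounds in Theorem 5.3.1.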

[Lower bounds:] Throughout the proof of the lower bounds, for any graph $\mathcal{G}$, we will use $R^{\mathrm{eff}}_{u,v}(\mathcal{G})$ to denote the effective resistance between nodes $u$ and $v$ in the electrical network $(\mathcal{G}, P_{\min})$ with every edge of $\mathcal{G}$ having a generalized resistance of $P_{\min}$. Now consider the generalized electrical network $(\mathcal{G}, P_{\min})$, where $\mathcal{G}$ is the measurement graph. From the Electrical Analogy theorem and Rayleigh's Monotonicity Law (Theorems 4.5.1 and 4.6.1), we get

\[ \Sigma_{u,o} \geq R^{\mathrm{eff}}_{u,o}(\mathcal{G}). \tag{5.7} \]

Since $\mathcal{G}$ is sparse in $\mathbb{R}^d$, it follows from Theorem 5.4.2 that there exists a positive integer $h$ such that $\mathcal{G} \subset \mathbf{Z}_d^{(h)}$. Let $\eta : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ be the embedding of $\mathcal{G}$ into $\mathbf{Z}_d^{(h)}$. Consider the generalized electrical network $(\mathbf{Z}_d^{(h)}, P_{\min})$ formed by assigning a generalized resistance of $P_{\min}$ to every edge of $\mathbf{Z}_d^{(h)}$. From Rayleigh's monotonicity law, we get

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}) \geq R^{\mathrm{eff}}_{u_z,o_z}(\mathbf{Z}_d^{(h)}), \tag{5.8} \]

where $u_z = \eta(u)$, $o_z = \eta(o)$ refer to the nodes in $\mathbf{Z}_d^{(h)}$ that correspond to the nodes $u, o$ in $\mathcal{G}$. When the graph is sparse in, say, 2D, it follows from (5.8) and Lemma 5.5.2 that

\[ R^{\mathrm{eff}}_{u,o}(\mathcal{G}) = \Omega\bigl(\log d_{\mathbf{Z}_2}(u_z, o_z)\bigr) = \Omega\bigl(\log d_f(u, o)\bigr), \]

where the second statement follows from (5.2) in Theorem 5.4.2. Combining the above with (5.7), we get $\Sigma_{u,o} = \Omega(\log d_f(u, o))$, which proves the lower bound in the 2D case. The statements for the lower bounds in 1D and 3D can be proved in an analogous manner. This concludes the proof of the theorem. ■
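The scaling laws just proved can be checked numerically on small finite grids. The sketch below is an illustration and not part of the dissertation: it assumes scalar unit resistors, uses the standard Laplacian pseudo-inverse formula for effective resistance, and the function names are hypothetical. Because the grids are finite, boundary effects are present, so it only indicates the trend (fast growth in 1D, slow growth in 2D, near saturation in 3D).

```python
from itertools import product

import numpy as np


def lattice_laplacian(n, d):
    """Laplacian of the d-dimensional grid with n nodes per side and unit resistors."""
    nodes = list(product(range(n), repeat=d))
    index = {p: i for i, p in enumerate(nodes)}
    lap = np.zeros((len(nodes), len(nodes)))
    for p in nodes:
        for axis in range(d):
            # Neighbour one step further along this axis.
            q = p[:axis] + (p[axis] + 1,) + p[axis + 1:]
            if q in index:  # the neighbour lies inside the finite grid
                i, j = index[p], index[q]
                lap[i, i] += 1.0
                lap[j, j] += 1.0
                lap[i, j] -= 1.0
                lap[j, i] -= 1.0
    return lap, index


def effective_resistance(lap, i, j):
    """Effective resistance between nodes i and j via the Laplacian pseudo-inverse."""
    lap_pinv = np.linalg.pinv(lap)
    return lap_pinv[i, i] + lap_pinv[j, j] - 2.0 * lap_pinv[i, j]


def resistance_from_center(n, d, k):
    """Resistance between the grid centre and the node k steps away along the first axis."""
    lap, index = lattice_laplacian(n, d)
    center = (n // 2,) * d
    target = (n // 2 + k,) + (n // 2,) * (d - 1)
    return effective_resistance(lap, index[center], index[target])


if __name__ == "__main__":
    for d in (1, 2, 3):
        values = [resistance_from_center(9, d, k) for k in (1, 2, 4)]
        print(f"d={d}:", [round(v, 3) for v in values])
```

By Rayleigh's monotonicity law, the resistance for a fixed pair of nodes can only decrease when the 1D path is extended to a 2D grid and then to a 3D grid, which is what the printed rows should show.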


5.6  Comments  and  open  problems 

We established a classification of graphs, namely, dense or sparse in $\mathbb{R}^d$, $1 \leq d \leq 3$, that determines how the optimal estimation error of a node grows with its distance from the reference node. The classification of dense and sparse graphs is interesting only for infinite graphs, since no finite graph can be dense in any dimension and a finite graph is sparse in every dimension.

Although infinite graphs are a compelling approximation to large finite graphs, this approximation imposes the constraint that the scaling laws apply only to nodes in the interior of the finite graphs. For a node that is not close to the boundary of the graph, the graph appears as if it extends to infinity in all directions. In that case we can regard the node as belonging to an infinite graph. In fact, the results in Chapter 4 show that in such a situation, the BLUE covariance of the node in the finite measurement graph can be quite close to the covariance in the infinite graph. Still, this leaves open the question of the covariances of nodes on the boundary. In a finite graph, it may be of interest to obtain bounds on the maximum BLUE covariance. This is equivalent to obtaining bounds on the maximum effective resistance in the graph, where the maximum is taken over all pairs of nodes. It may be possible to obtain such bounds for very special classes of finite graphs such as rectangular grids by using known results on the effective resistance in such graphs [120]. However, how to do it for a wider class of finite graphs is an open question.

Another  avenue  of  future  research  is  to  examine  the  role  of  randomness  in 
the  graph’s  structure  explicitly.  Although  the  dense  and  sparse  classification  we 
obtained  does  allow  randomness  in  the  structure  of  the  graph,  the  effect  of  such 
randomness  on  the  scaling  laws  for  the  error  is  not  explicitly  accounted  for  in 
the  present  work.  A  useful  research  direction  would  be  the  investigation  of  the 
estimation  error  covariances  in  graphs  with  random  structure,  such  as  random 
geometric graphs [109]. The BLUE covariance itself is a random variable in these situations. Another interesting avenue is the exploration of BLUE covariances in scale-free graphs, which have been a popular - if somewhat controversial - model for many large-scale networks, both man-made and natural [108].

Scale-free networks have the additional difficulty that there is no well-accepted definition of a scale-free graph (see [121] for an extensive discussion of this issue). A preliminary investigation of a related class of graphs called Gromov hyperbolic graphs, which have the advantage of at least being well-defined, has been undertaken in [122]. Effective resistance in small world graphs has been studied in [123] using methods of statistical physics.


5.7  Proofs 


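Proof of Lemma 5.3.1. We prove that 1 implies 2 by contradiction. Assuming that 2 does not hold, we have that

\[ \forall \alpha\ \forall \beta\ \exists\, u, v \in \mathcal{V} \text{ such that } d_{\mathcal{G}}(u,v) > \alpha\, d_f(u,v) + \beta, \]

or equivalently

\[ \forall \alpha\ \forall \beta\ \exists\, u, v \in \mathcal{V} \text{ such that } \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} < \frac{1}{\alpha} - \frac{\beta}{\alpha\, d_{\mathcal{G}}(u,v)}. \]

This means that for a given $\alpha$, $\beta$, the set

\[ \Bigl\{ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} : u, v \in \mathcal{V} \text{ and } d_{\mathcal{G}}(u,v) \geq \beta \Bigr\} \]

contains at least the element

\[ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} < \frac{1}{\alpha} - \frac{\beta}{\alpha\, d_{\mathcal{G}}(u,v)} < \frac{1}{\alpha}, \]

and therefore

\[ \inf \Bigl\{ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} : u, v \in \mathcal{V} \text{ and } d_{\mathcal{G}}(u,v) \geq \beta \Bigr\} < \frac{1}{\alpha}. \]

Making $\beta \to \infty$ we obtain that

\[ \rho = \lim_{\beta \to \infty} \inf \Bigl\{ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} : u, v \in \mathcal{V} \text{ and } d_{\mathcal{G}}(u,v) \geq \beta \Bigr\} \leq \frac{1}{\alpha}. \]

But since $\alpha$ can be arbitrarily large, the above actually implies that $\rho = 0$, which contradicts 1.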


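To prove that 2 implies 1, we note that when 2 holds, we conclude that for every pair of nodes $u, v \in \mathcal{V}$ for which $d_{\mathcal{G}}(u,v) \geq n$, we have that

\[ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} \geq \frac{1}{\alpha} - \frac{\beta}{\alpha\, d_{\mathcal{G}}(u,v)} \geq \frac{1}{\alpha} - \frac{\beta}{\alpha n}, \qquad \forall\, u \neq v \in \mathcal{V}. \]

Therefore,

\[ \inf \Bigl\{ \frac{d_f(u,v)}{d_{\mathcal{G}}(u,v)} : u, v \in \mathcal{V} \text{ and } d_{\mathcal{G}}(u,v) \geq n \Bigr\} \geq \frac{1}{\alpha} - \frac{\beta}{\alpha n}. \]

As $n \to \infty$, the left-hand side converges to $\rho$ and the right-hand side converges to $\frac{1}{\alpha} > 0$, from which 1 follows.

The statements about the maximum connected range and the existence of constants $\alpha > 0$, $\beta \geq 0$ can be proved in a manner similar to that above. ■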


Proof of Theorem 5.4.1. We will denote by $g : \mathcal{V}_{\mathbf{Z}_d} \to \mathbb{R}^d$ the natural drawing of the lattice $\mathbf{Z}_d$.

($\Rightarrow$) We have to prove that if $\mathcal{G}$ is dense in $\mathbb{R}^d$, conditions (i) and (ii) are satisfied. Since $\mathcal{G}$ is dense in $\mathbb{R}^d$, there is a drawing function $f : \mathcal{V} \to \mathbb{R}^d$ so that the $f$-drawing of $\mathcal{G}$ has a $\gamma < \infty$ and $\rho > 0$. Define a new drawing $f' : \mathcal{V} \to \mathbb{R}^d$ as

\[ f'(u) = \frac{1}{\gamma} f(u), \qquad \forall u \in \mathcal{V}, \]

so that the maximum uncovered diameter $\gamma'$ of the $f'$-drawing of $\mathcal{G}$ is 1. Note that $f'$ is still a dense drawing of $\mathcal{G}$. Now we superimpose the natural $g$-drawing of $\mathbf{Z}_d$ on the $f'$-drawing of $\mathcal{G}$, and draw open balls of diameter one, $B(g(u_z), \tfrac{1}{2})$, centered at the natural drawing $g(u_z)$ of every lattice node. Figure 5.5 shows an example in $\mathbb{R}^2$. Since $\gamma' = 1$, it follows from the definition of denseness that in every one of those balls, there is at least one node $u \in \mathcal{V}$. To construct the embedding, we associate each node of the lattice to a node of $\mathcal{G}$ whose drawing appears inside the ball centered around the lattice node. This defines an injective function $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$. Consider two nodes of the lattice $u_z, v_z \in \mathcal{V}_{\mathbf{Z}_d}$ that have an edge between them. Let $u := \eta(u_z)$, $v := \eta(v_z)$. Since $f'(u)$ and $f'(v)$ belong to adjacent balls of unit diameter (see Figure 5.5),

\[ d_{f'}(u,v) = \| f'(u) - f'(v) \| \leq 2. \]

From Lemma 5.3.1 and $f'$ being a dense drawing in $\mathbb{R}^d$, it follows that $d_{\mathcal{G}}(u,v) \leq 2\alpha + \beta$ for some positive constants $\alpha$ and $\beta$. Define $h := \lceil 2\alpha + \beta \rceil$. Then $u$ and $v$ will have an edge between them in the $h$-fuzz $\mathcal{G}^{(h)}$. So $\mathcal{G}^{(h)} \supset \mathbf{Z}_d$, and we have the desired result that denseness implies (i).

Figure 5.5. Superimposing a 2-dimensional lattice (brown) on a 2-dimensional dense graph (black).

For every $u \in \mathcal{V}$, find $u_z \in \mathcal{V}_{\mathbf{Z}_d}$ as the node in the lattice such that the ball of unit diameter drawn around it is closest to $u$. That is, find $u_z \in \mathcal{V}_{\mathbf{Z}_d}$ such that

\[ u_z = \arg\min_{u'_z \in \mathcal{V}_{\mathbf{Z}_d}} \operatorname{dist}\bigl(f'(u), B(g(u'_z), \tfrac{1}{2})\bigr), \tag{5.9} \]

where $\operatorname{dist}(x, A)$ between a point $x \in \mathbb{R}^d$ and a set $A \subset \mathbb{R}^d$ is defined as

\[ \operatorname{dist}(x, A) = \inf_{y \in A} \| x - y \|. \]

There are only $2^d$ balls one needs to check to determine the minimum in (5.9), so $u_z$ exists, though it may not be unique. If there are multiple minima, pick any one. This procedure defines an onto map $\xi : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$. Let $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$ be the embedding of $\mathbf{Z}_d$ into $\mathcal{G}^{(h)}$ as described earlier in this proof. Define $\psi : \mathcal{V} \to \mathcal{V}$ as $\psi := (\eta \circ \xi)$. We will now show that, for every $u \in \mathcal{V}$, the node $\psi(u) \in \mathcal{V}$, which has a corresponding node in the lattice, is within a uniformly bounded graphical distance of $u$. Since $f'(u)$ either lies in the ball centered at $g(u_z)$ or in the gaps between that ball and the neighboring balls, $\| f'(u) - g(u_z) \| \leq \sqrt{d}$. Therefore,

\[ d_{f'}(u, \psi(u)) \leq \| f'(u) - g(u_z) \| + \| g(u_z) - f'(\psi(u)) \| \leq \sqrt{d} + \tfrac{1}{2} \leq \tfrac{3}{2}\sqrt{d}, \tag{5.10} \]

where we have used the fact that $f'(\psi(u)) \in B(g(u_z), \tfrac{1}{2})$. From Lemma 5.3.1 and the denseness of the $f$-drawing of $\mathcal{G}$, we get

\[ d_{\mathcal{G}}(u, \psi(u)) \leq \alpha\, d_f(u, \psi(u)) + \beta = \alpha\gamma\, d_{f'}(u, \psi(u)) + \beta \leq \tfrac{3}{2}\alpha\gamma\sqrt{d} + \beta. \]

Define

\[ c := \bigl\lceil \tfrac{3}{2}\alpha\gamma\sqrt{d} + \beta \bigr\rceil, \tag{5.11} \]

which is a constant independent of $u$ and $v$. Then for every $u \in \mathcal{V}$ there exists a $\bar{u} := \psi(u) \in \eta(\mathcal{V}_{\mathbf{Z}_d}) \subset \mathcal{V}$ such that $d_{\mathcal{G}}(u, \bar{u}) \leq c$, which is the desired condition (ii).
($\Leftarrow$) We have to prove that if (i) and (ii) are satisfied, then $\mathcal{G}$ is dense in $\mathbb{R}^d$. We will construct a drawing $f$ of $\mathcal{G}$ in $\mathbb{R}^d$ with the following procedure and then prove that it is a dense drawing. Since $\mathbf{Z}_d \subset \mathcal{G}^{(h)}$, there is an injective map $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$ such that $\eta(\mathcal{V}_{\mathbf{Z}_d}) \subset \mathcal{V}$. Pick a node $u$ in $\mathcal{V}$ that has not been drawn yet. By (ii), there exist a positive constant $c$ and a node $u_z \in \mathcal{V}_{\mathbf{Z}_d}$ such that $\bar{u} := \eta(u_z) \in \mathcal{V}$ and $d_{\mathcal{G}}(u, \bar{u}) \leq c$. If $\bar{u}$ has not been drawn yet, then draw it at the location of its corresponding lattice node, i.e.,

\[ f(\bar{u}) = g(u_z). \tag{5.12} \]

A little thought will reveal that if $\bar{u}$ has been drawn already, as long as the drawing procedure outlined so far is followed, it must have been drawn on the lattice location $g(u_z)$, so (5.12) holds. Once $\bar{u}$ is drawn, we draw $u$ in the following way. In case $u = \bar{u}$, the drawing of $u$ is determined by the drawing of $\bar{u}$. If $u \neq \bar{u}$, draw $u$ by choosing a random location inside an open ball of diameter 1 with its center at $f(\bar{u})$. To show that a drawing obtained this way is dense, first note that the largest uncovered diameter $\gamma < 2$, since a subset of the nodes of $\mathcal{V}$ occupy the lattice node positions. Pick any two nodes $u, v \in \mathcal{V}$. Again, from (ii), we know that there exist $\bar{u}, \bar{v} \in \eta(\mathcal{V}_{\mathbf{Z}_d}) \subset \mathcal{V}$ such that $d_{\mathcal{G}}(u, \bar{u}) \leq c$ and $d_{\mathcal{G}}(v, \bar{v}) \leq c$ for some positive constant $c$. Therefore


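\[ d_{\mathcal{G}}(u, v) \leq d_{\mathcal{G}}(u, \bar{u}) + d_{\mathcal{G}}(\bar{u}, \bar{v}) + d_{\mathcal{G}}(v, \bar{v}) \leq 2c + h\, d_{\mathcal{G}^{(h)}}(\bar{u}, \bar{v}). \]

Since $\mathbf{Z}_d \subset \mathcal{G}^{(h)}$,

\begin{align*}
d_{\mathcal{G}^{(h)}}(\bar{u}, \bar{v}) &\leq d_{\mathbf{Z}_d}\bigl(\eta^{-1}(\bar{u}), \eta^{-1}(\bar{v})\bigr) \\
&= \| g(u_z) - g(v_z) \|_1 \\
&\leq \sqrt{d}\, \| g(u_z) - g(v_z) \| \\
&= \sqrt{d}\, \| f(\bar{u}) - f(\bar{v}) \| \quad \text{(from (5.12))} \\
&= \sqrt{d}\, d_f(\bar{u}, \bar{v}),
\end{align*}

where $\| \cdot \|_1$ denotes the vector 1-norm. Because of the way the drawing $f$ is constructed, we have $d_f(u, \bar{u}) \leq 1$, which implies $d_f(\bar{u}, \bar{v}) \leq d_f(\bar{u}, u) + d_f(u, v) + d_f(v, \bar{v}) \leq d_f(u, v) + 2$. So we have

\begin{align*}
d_{\mathcal{G}}(u, v) &\leq 2c + h\sqrt{d}\, \bigl( d_f(u, v) + 2 \bigr) \\
&= 2(c + h\sqrt{d}) + h\sqrt{d}\, d_f(u, v).
\end{align*}

From Lemma 5.3.1, we see that the asymptotic distance ratio $\rho > 0$ for the $f$-drawing of $\mathcal{G}$, which establishes that $f$ is a dense drawing of $\mathcal{G}$ in $\mathbb{R}^d$. It follows that $\mathcal{G}$ is dense in $\mathbb{R}^d$.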


To prove the relationship (5.1) for any dense drawing $f$, consider again the scaled drawing $f'$ defined as $f' = f/\gamma$, so that the maximum uncovered diameter of $f'$ is 1. Since $\mathcal{G}$ is dense in $\mathbb{R}^d$, $\mathbf{Z}_d$ can be embedded in $\mathcal{G}^{(h)}$ with an embedding $\eta : \mathcal{V}_{\mathbf{Z}_d} \to \mathcal{V}$. We choose the embedding $\eta$ as described in the first part of the proof. For every $u \in \mathcal{V}$, call $u_z := \xi(u)$, where $\xi : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ was defined earlier in this proof for the $f'$ dense drawing of $\mathcal{G}$. Now consider two arbitrary nodes $u, v \in \mathcal{V}$ and let $u_z := \xi(u)$, $v_z := \xi(v)$ (see Figure 5.6). It was shown earlier in this proof that for every pair of nodes $u, v \in \mathcal{V}$, we have $d_{\mathcal{G}}(u, \eta(u_z)) \leq c$ and $d_{\mathcal{G}}(v, \eta(v_z)) \leq c$, where $c$ is defined in (5.11).

Figure 5.6. Natural drawing of the 2-D lattice (brown) superimposed on the $f'$ drawing of $\mathcal{G}$. Edges are not shown to prevent clutter. In this example, $u = \bar{u}$ but $v \neq \bar{v}$.

Now,

\[ d_{\mathbf{Z}_d}(u_z, v_z) = \| g(u_z) - g(v_z) \|_1 \leq \sqrt{d}\, \| g(u_z) - g(v_z) \|, \]

and

\begin{align*}
\| g(u_z) - g(v_z) \| \leq{} & \| g(u_z) - f'(\bar{u}) \| + \| f'(\bar{u}) - f'(u) \| \\
& + \| f'(u) - f'(v) \| + \| f'(v) - f'(\bar{v}) \| \\
& + \| f'(\bar{v}) - g(v_z) \|.
\end{align*}

We know that $\| g(u_z) - f'(\bar{u}) \| \leq \tfrac{1}{2} \leq \tfrac{\sqrt{d}}{2}$ since $f'(\bar{u}) \in B(g(u_z), \tfrac{1}{2})$, and $\| f'(u) - f'(\bar{u}) \| \leq \tfrac{3}{2}\sqrt{d}$ from (5.10). Using these in the above, we get

\begin{align*}
\| g(u_z) - g(v_z) \| &\leq 4\sqrt{d} + d_{f'}(u, v), \\
\Rightarrow\ d_{\mathbf{Z}_d}(u_z, v_z) &\leq 4d + \frac{\sqrt{d}}{\gamma}\, d_f(u, v),
\end{align*}

which is the desired result. ■


Proof of Theorem 5.4.2. We will denote by $g : \mathcal{V}_{\mathbf{Z}_d} \to \mathbb{R}^d$ the natural drawing of the lattice $\mathbf{Z}_d$.

($\Rightarrow$) Since $\mathcal{G}$ is sparse in $\mathbb{R}^d$, there is a drawing function $f : \mathcal{V} \to \mathbb{R}^d$ that produces a civilized drawing of $\mathcal{G}$ with minimum node distance $s > 0$ and maximum connected range $r < \infty$. Consider a new drawing $f' : \mathcal{V} \to \mathbb{R}^d$ of $\mathcal{G}$ defined as

\[ f'(u) = \frac{\sqrt{d}}{s}\, f(u), \qquad \forall u \in \mathcal{V}. \tag{5.13} \]

The minimum node distance and the maximum connected range in this drawing are

\[ s' = \sqrt{d}, \qquad r' = \frac{\sqrt{d}}{s}\, r. \]

Superimpose the two drawings $g(\mathbf{Z}_d)$ and $f'(\mathcal{G})$ (cf. Figure 5.7). In every lattice cell³ in $\mathbb{R}^d$, there is at most one node of $\mathcal{G}$, for if there are two, then in an open ball of diameter $\sqrt{d}$ in $\mathbb{R}^d$ there are two points $f'(u)$ and $f'(v)$, where $u, v \in \mathcal{V}$, which violates the condition that

\[ s' = \inf_{\substack{u \neq v \\ u, v \in \mathcal{V}}} \| f'(u) - f'(v) \| = \sqrt{d}. \]

Define a mapping $\eta : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ by associating every node $u \in \mathcal{V}$ to the lattice vertex with the most negative coordinate in the lattice cell where $f'(u)$ lies (cf. Figure 5.7). Since a lattice cell contains the $f'$ drawing of at most one node, $\eta$ is injective. Let $u_z := \eta(u)$ and $v_z := \eta(v)$. Since $f'(u)$ and $g(u_z)$ lie in the same lattice cell,

\[ \| f'(u) - g(u_z) \| < \sqrt{d}, \qquad \forall u \in \mathcal{V}. \tag{5.14} \]


³ A lattice cell is taken as a unit semi-open hypercube in $\mathbb{R}^d$, which is a subset of $\mathbb{R}^d$ of the form $[a_1, a_1 + 1) \times [a_2, a_2 + 1) \times \cdots \times [a_d, a_d + 1)$ for some $a_1, a_2, \ldots, a_d \in \mathbb{R}$.

Figure 5.7. A graph that can be drawn in a civilized manner in $\mathbb{R}^2$ can be embedded in a fuzz of the 2-D lattice.

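So,

\begin{align*}
d_{\mathbf{Z}_d}(u_z, v_z) &= \| g(u_z) - g(v_z) \|_1 \\
&\leq \sqrt{d}\, \| g(u_z) - g(v_z) \| \\
&\leq \sqrt{d}\, \bigl( \| g(u_z) - f'(u) \| + \| f'(u) - f'(v) \| + \| f'(v) - g(v_z) \| \bigr) \\
&\leq \sqrt{d}\, \bigl( 2\sqrt{d} + d_{f'}(u, v) \bigr).
\end{align*}

If $(u, v) \in \mathcal{E}$, then $d_{f'}(u, v) \leq r' = \frac{\sqrt{d}}{s} r$, where $r'$ is the maximum connected range in the $f'$ drawing of $\mathcal{G}$, so we get

\[ d_{\mathbf{Z}_d}(u_z, v_z) \leq 2d + \frac{d}{s}\, r < \infty. \]

Defining $h := \lceil 2d + \frac{d}{s} r \rceil$, we see that there is an edge between $u_z$ and $v_z$, i.e., between $\eta(u)$ and $\eta(v)$, in $\mathbf{Z}_d^{(h)}$. This proves that $\eta$ is an embedding, so $\mathcal{G} \subset \mathbf{Z}_d^{(h)}$.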

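($\Leftarrow$) Since $\mathcal{G} \subset \mathbf{Z}_d^{(h)}$ for some $h < \infty$, there is an embedding $\eta : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ of $\mathcal{G}$ in $\mathbf{Z}_d^{(h)}$. Consider a drawing $f$ of $\mathcal{G}$ defined as

\[ f(u) = g(\eta(u)), \qquad \forall u \in \mathcal{V}. \]

We immediately get $s \geq 1 > 0$ for this drawing. If $(u, v) \in \mathcal{E}$,

\begin{align*}
\| f(u) - f(v) \| &= \| g(\eta(u)) - g(\eta(v)) \| \\
&\leq \| g(\eta(u)) - g(\eta(v)) \|_1 \\
&= d_{\mathbf{Z}_d}(\eta(u), \eta(v)) \leq h,
\end{align*}

where the last inequality follows because $\mathcal{G} \subset \mathbf{Z}_d^{(h)}$. Therefore the maximum connected range $r$ in the drawing $f$ of $\mathcal{G}$ satisfies $r \leq h < \infty$. The drawing of $\mathcal{G}$ specified by $f$ is therefore a civilized drawing in $\mathbb{R}^d$, from which it follows that $\mathcal{G}$ is sparse in $\mathbb{R}^d$.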

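To prove the relation (5.2), we go back to the drawing $f'$ defined in (5.13) based on the civilized drawing $f$ of $\mathcal{G}$. Since $\mathcal{G}$ is sparse in $\mathbb{R}^d$, $\mathcal{G} \subset \mathbf{Z}_d^{(h)}$ for some positive integer $h$. Let $\eta : \mathcal{V} \to \mathcal{V}_{\mathbf{Z}_d}$ be the embedding of $\mathcal{G}$ into $\mathbf{Z}_d^{(h)}$. Denoting $u_z := \eta(u)$, for any $u, v \in \mathcal{V}$ we get

\[ d_{\mathbf{Z}_d}(u_z, v_z) = \| g(u_z) - g(v_z) \|_1 \geq \| g(u_z) - g(v_z) \|. \]

Since $\| f'(u) - f'(v) \| \leq \| f'(u) - g(u_z) \| + \| g(u_z) - g(v_z) \| + \| g(v_z) - f'(v) \|$, we get from the above that

\begin{align*}
d_{\mathbf{Z}_d}(u_z, v_z) &\geq \| f'(u) - f'(v) \| - \| f'(u) - g(u_z) \| - \| g(v_z) - f'(v) \| \\
&\geq d_{f'}(u, v) - 2\sqrt{d},
\end{align*}

where we have used the fact that both $f'(u)$ and $g(u_z)$ lie in the same lattice cell for every $u \in \mathcal{V}$. Since $d_{f'}(\cdot) = d_f(\cdot)\, \frac{\sqrt{d}}{s}$, the result follows. ■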
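The construction used in this proof (rescale the civilized drawing so that the minimum node distance becomes $\sqrt{d}$, then send each node to the lowest corner of its lattice cell) is easy to try out. The following is a minimal sketch under our own assumptions: the drawing is given as a dictionary of coordinates, and the function names and the sample drawing are hypothetical. It is meant only to illustrate the map $\eta$, not to reproduce any code from the dissertation.

```python
import math
from itertools import combinations


def embed_in_lattice(drawing):
    """Scale a civilized drawing as in (5.13) and map nodes to lattice-cell corners."""
    points = list(drawing.values())
    d = len(points[0])
    # Minimum node distance s of the original drawing.
    s = min(math.dist(p, q) for p, q in combinations(points, 2))
    scale = math.sqrt(d) / s
    scaled = {u: tuple(scale * x for x in p) for u, p in drawing.items()}
    # The vertex with the most negative coordinates in a unit cell is the floor.
    eta = {u: tuple(math.floor(x) for x in p) for u, p in scaled.items()}
    return eta, scaled


def lattice_distance(a, b):
    """Graphical distance in the lattice, i.e. the 1-norm between integer points."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    drawing = {"a": (0.0, 0.0), "b": (0.3, 0.1), "c": (1.0, 0.7), "d": (-0.5, 0.9)}
    eta, scaled = embed_in_lattice(drawing)
    for u, v in combinations(drawing, 2):
        print(u, v, lattice_distance(eta[u], eta[v]),
              round(math.dist(scaled[u], scaled[v]), 3))
```

For each pair of nodes, the printed lattice distance should lie between $d_{f'}(u,v) - 2\sqrt{d}$ and $\sqrt{d}\,(2\sqrt{d} + d_{f'}(u,v))$, which are the two bounds derived in the proof.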

Proof of Lemma 5.4.1. We only provide the proof that the 2-dimensional lattice is not sparse in $\mathbb{R}$ and is not dense in $\mathbb{R}^3$. The general case for arbitrary dimensions is analogous.

To prove by contradiction the lack of denseness, assume that there exists a dense drawing $f$ of $\mathbf{Z}_2$ in $\mathbb{R}^3$, with associated $\gamma < \infty$ and $\rho > 0$. Fix the origin of $\mathbb{R}^3$ at $f(u)$ for an arbitrary node $u$ in the lattice $\mathbf{Z}_2$. For an arbitrary $D > 0$, the volume of the sphere in $\mathbb{R}^3$ centered at the origin with diameter $D$, denoted by $B_3(0, D)$, is $\Omega(D^3)$. Therefore the number of nodes of $\mathbf{Z}_2$ drawn inside $B_3(0, D)$ is $\Omega((D/\gamma)^3) = \Omega(D^3)$. It is straightforward to show that for any set of $n$ distinct nodes in the lattice $\mathbf{Z}_2$, the maximum graphical distance between any two nodes in the set is $\Omega(\sqrt{n})$. Therefore the maximum graphical distance between the nodes in $B_3(0, D)$ is $\Omega(D^{3/2})$.

The maximum Euclidean distance between any two nodes drawn inside the sphere $B_3(0, D)$ under the $f$-drawing is at most $D$, and since $f$ is a dense drawing, it follows from Lemma 5.3.1 that for every pair of nodes $u, v$ in $\mathbf{Z}_2$ such that $f(u), f(v) \in B_3(0, D)$, we have $d_{\mathbf{Z}_2}(u, v) \leq \alpha D + \beta$. Therefore, the maximum graphical distance between pairs of nodes whose drawing falls inside $B_3(0, D)$ is $O(D)$, as well as $\Omega(D^{3/2})$, which is a contradiction for sufficiently large $D$. Hence no dense drawing of $\mathbf{Z}_2$ in $\mathbb{R}^3$ is possible.

To show that $\mathbf{Z}_2$ is not sparse in $\mathbb{R}$, assume that there exists a civilized drawing of $\mathbf{Z}_2$ in $\mathbb{R}$ with $s > 0$ and $r < \infty$, where $r$ and $s$ are constants. Consider a subgraph $\mathbf{Z}_2(n)$ of $\mathbf{Z}_2$ that consists of all nodes within a Euclidean distance $n$ from the origin. The total number of nodes in this finite subgraph is $\Omega(n^2)$. The length $L$ of the interval in which the nodes of this subgraph are located in the sparse 1-d drawing of $\mathbf{Z}_2$ is clearly $L = \Omega(s n^2)$. Since the maximum graphical distance between any two nodes in the subgraph $\mathbf{Z}_2(n)$ is $O(n)$ by construction, the maximum connected range in the 1-d drawing must satisfy $r = \Omega(L/n) = \Omega(sn)$. Since this must be true for every $n$, $r$ cannot be a finite constant. Thus, no civilized drawing of $\mathbf{Z}_2$ in $\mathbb{R}$ exists. ■




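Proof of Lemma 5.5.1. Due to Lemma 4.6.1, we need to prove the result only for the case of scalar-valued unit resistors. Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be a connected graph with a unit resistor on every edge. Let $\mathcal{G}^{(h)} = (\mathcal{V}, \mathcal{E}^{(h)})$ be the $h$-fuzz of $\mathcal{G}$. We assign a unit resistance to every edge of $\mathcal{E}^{(h)}$. The edge set $\mathcal{E}^{(h)}$ consists of two disjoint subsets $\mathcal{E}$ and $\mathcal{E}_h$, where $\mathcal{E}_h$ is the set of “new” edges in $\mathcal{G}^{(h)}$ that were not there in $\mathcal{G}$. That is, $\mathcal{E}^{(h)} = \mathcal{E} \cup \mathcal{E}_h$ and $\mathcal{E} \cap \mathcal{E}_h = \emptyset$. To every edge $e \in \mathcal{E}_h$, there corresponds a path $\mathcal{P}_e$ of length $\ell_e \leq h$ in $\mathcal{G}$. See Figure 5.8(A) and (B) for an example. Replace every edge $e \in \mathcal{E}_h$ by a series of $\ell_e$ edges, each with unit resistance, and call the resulting graph $\hat{\mathcal{G}}^{(h)}$. We introduce new nodes in doing so. To every one of these new nodes in $\hat{\mathcal{G}}^{(h)}$ there corresponds a node in $\mathcal{G}$ (see Figure 5.8(C)). By Rayleigh's monotonicity law, the effective resistance has increased: $R^{\mathrm{eff}}_{\hat{u},\hat{v}}(\hat{\mathcal{G}}^{(h)}) \geq R^{\mathrm{eff}}_{u,v}(\mathcal{G}^{(h)})$, where $\hat{u}$ is a node in $\hat{\mathcal{G}}^{(h)}$ that corresponds to $u$ in $\mathcal{G}^{(h)}$.

However, since we have increased the resistance of any edge by no more than a factor of $h$, the increase in effective resistance is no more than a factor of $h$:

\[ R^{\mathrm{eff}}_{\hat{u},\hat{v}}(\hat{\mathcal{G}}^{(h)}) \leq h\, R^{\mathrm{eff}}_{u,v}(\mathcal{G}^{(h)}). \tag{5.15} \]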

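Now for every edge $e$ of $\mathcal{E}_h$, look at the corresponding series of resistors in $\hat{\mathcal{G}}^{(h)}$. Its endpoints lie in the original graph $\mathcal{G}$. Take its intermediate vertices and short them to vertices of $\mathcal{G}$ along the path $\mathcal{P}_e$. We do this for every edge $e \in \mathcal{E}_h$ and call the resulting graph $\mathcal{G}'$. Again due to Rayleigh's monotonicity law,

\[ R^{\mathrm{eff}}_{u',v'}(\mathcal{G}') \leq R^{\mathrm{eff}}_{\hat{u},\hat{v}}(\hat{\mathcal{G}}^{(h)}), \tag{5.16} \]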

where $u'$ denotes the node in $\mathcal{G}'$ that corresponds to $\hat{u}$ in $\hat{\mathcal{G}}^{(h)}$. The graph $\mathcal{G}'$ differs from $\mathcal{G}$ only in having extra parallel edges between its nodes (Figure 5.8(D)). However, the number of edges in $\mathcal{G}'$ that are parallel to an edge $e$ in $\mathcal{G}$ is no more than the number of paths in $\mathcal{G}$ of length at most $h$ that traverse the edge $e$. Since $\mathcal{G}$ is a bounded degree graph, there is an upper bound to this number. Let $\eta$ be this upper bound. It is easy to see from the Parallel Resistors Proposition 4.6.1 that if every edge of a graph - that has unit resistance on every edge - is replaced by $\eta$ parallel edges and every one of the new edges has unit resistance, then the effective resistance in the new graph is lower than that in the original graph by a factor of $\eta$. Combining with Rayleigh's monotonicity law, we get

\[ \frac{1}{\eta}\, R^{\mathrm{eff}}_{u,v}(\mathcal{G}) \leq R^{\mathrm{eff}}_{u',v'}(\mathcal{G}'). \tag{5.17} \]

Figure 5.8. Fuzzing doesn't change the effective resistance too much.

Equations (5.15), (5.16) and (5.17) give us

\[ \frac{1}{h\eta}\, R^{\mathrm{eff}}_{u,v}(\mathcal{G}) \leq R^{\mathrm{eff}}_{u,v}(\mathcal{G}^{(h)}), \tag{5.18} \]

which proves the result.
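The claim that fuzzing changes the effective resistance by at most a constant factor can be illustrated numerically. The sketch below is ours, not the author's: it assumes scalar unit resistors, represents a graph as a dictionary of neighbor sets, and uses hypothetical helper names. It builds the $h$-fuzz by breadth-first search and compares the resistance between the end nodes of a path graph before and after fuzzing.

```python
from collections import deque

import numpy as np


def h_fuzz(adj, h):
    """Adjacency of the h-fuzz: u and v are joined iff 1 <= graph distance <= h."""
    fuzz = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == h:  # do not explore beyond distance h
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        fuzz[src] = {v for v in dist if v != src}
    return fuzz


def effective_resistance(adj, u, v):
    """Effective resistance with unit resistors, via the Laplacian pseudo-inverse."""
    nodes = sorted(adj)
    idx = {x: i for i, x in enumerate(nodes)}
    lap = np.zeros((len(nodes), len(nodes)))
    for a in adj:
        for b in adj[a]:
            lap[idx[a], idx[a]] += 1.0  # contributes to the degree of a
            lap[idx[a], idx[b]] -= 1.0
    lap_pinv = np.linalg.pinv(lap)
    i, j = idx[u], idx[v]
    return lap_pinv[i, i] + lap_pinv[j, j] - 2.0 * lap_pinv[i, j]


def path_graph(n):
    """Path on nodes 0..n-1 as a dictionary of neighbor sets."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


if __name__ == "__main__":
    g = path_graph(11)
    g2 = h_fuzz(g, 2)
    print(effective_resistance(g, 0, 10), effective_resistance(g2, 0, 10))
```

For the path graph with $h = 2$, each original edge is traversed by at most three paths of length at most 2 (itself and two 2-hop paths), so the argument above with $\eta = 3$ predicts that the fuzzed resistance is at least one sixth of the original, while Rayleigh's monotonicity law guarantees it is no larger than the original.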




Part  II 


Control with Relative Measurements


Chapter 6


Decentralized  formation  control: 
effective  resistance  vs.  scalability 

6.1  Introduction 

In this chapter we consider the problem of formation control by a group of autonomous agents. A formation is specified in terms of desired relative positions between agents. Each agent can measure its position relative to only a limited number of nearby agents. The task for each agent is to take control actions, e.g., modify its acceleration and/or velocity, using only the locally available information, such that the group attains its collective goal of maintaining the desired formation.

The motivation for studying formation control problems arises from their relevance to diverse applications, from military surveillance to swarming in nature. Keeping a formation among a group of vehicles is important in certain military applications where sensor assets are limited. In that case, individual team members can concentrate their sensors across a limited portion of the environment, while the team as a whole is still able to sense the whole environment [23]. A group of aerial vehicles can reduce drag by maintaining their relative positions at specific values [18, 19]. Similarly, the capacity of highways can be improved if large groups of vehicles, called platoons, can move in formation while maintaining a small inter-vehicular separation (see [6] and references therein). Yet another application of formation control is interferometric imaging by a formation of satellites, which can lead to a higher degree of accuracy than what is possible with a single satellite [24, 124].

Study of formation control is useful in understanding biological systems as
well. Several species of birds and many species of fish and aquatic animals are
known to exhibit “swarming”, which loosely means some form of aggregate motion
by a group as a whole. A few examples of such swarming are: pattern formation by
schools of fish [125], synchronized predation of copepods by juvenile herring [126],
movement in formation by spiny lobsters to reduce drag [20], and V-formation flying
by birds (apparently to improve lift [22] or for better visual cues [127]).

The  control  action  taken  by  an  agent  necessarily  depends  on  the  states  of  the 
other  agents  in  order  to  maintain  the  formation.  Thus,  the  dynamics  of  individual 
agents  become  coupled,  or  interconnected,  which  can  be  described  by  a  graph.  The 
nodes  of  the  graph  are  the  agents  and  the  edges  are  the  node  pairs  whose  states 
appear  in  each  other’s  control  algorithms.  We  call  this  graph  the  interconnection 
graph. 

The interconnection graph depends on the choice of control architecture. In a
centralized control architecture, all the relative position measurements are made
available either to a leader node or to every node. If the control signals for
all the nodes are computed by a leader, it transmits those signals back to the
individual nodes. Otherwise, every node computes its own control signal from the
global information it has available. In the former case, the interconnection graph
will look like a “star”, with an edge $(u, o)$ between every node $u$ and the leader $o$.
In the latter, the interconnection graph will be a complete graph, in which every
node is connected to every other node.

In contrast, in a decentralized architecture, every node uses only the information
that it can obtain either through communication with its nearby nodes or through
on-board sensors. For example, when every node uses the measurements of its
relative position with respect to its nearby nodes, which it can obtain using on-board
sensors such as radars, the resulting architecture is decentralized. The
interconnection graph will have a highly local structure, with edges existing only between
nodes that are physically close. A centralized architecture suffers from a high
communication overhead compared to a decentralized one, which makes a
decentralized architecture more appealing, particularly for large groups of autonomous
agents.

We will study decentralized architectures in this chapter, and focus on
situations when the interconnection graph has a large number of nodes. Scalability is
an important issue, which refers to how sensitive the performance of the closed
loop is to the number of nodes. The performance metric is application dependent,
and may refer to the stability margin, sensitivity to measurement noise, etc.
Typically, if the performance of the closed loop is independent of, or degrades slowly
with, the number of nodes, then the control algorithm is termed scalable. We will
see in this chapter that scalability of formation control is as much a function of
the structure of the interconnection graph as it is of the control algorithm.

Since our focus is on the interconnection structure, we consider simple models
of node dynamics and simple control laws. In particular, the dynamics of each
node is modeled as an integrator, and each node modifies its velocity depending
on a local error it estimates by comparing the positions of its neighbors (relative
to itself) to the desired ones. Such node models and control laws have been
investigated extensively in the literature [29, 128]. We will refer to this particular
control law as Laplacian disagreement control, the term being borrowed due to
the control law’s close connection to the Laplacian disagreement function used
in [129]. Apart from employing a simple control law, we also ignore some of the
issues faced in practice, such as sensing and communication faults, time variation
in the interconnection due to these and other reasons, obstacle avoidance, etc.

Chapter organization: Two topics are studied in this chapter: sensitivity
to measurement noise and the minimum eigenvalue of the Dirichlet Laplacian.
Each section introduces a topic, states the problem and then presents the results.
Section 6.3 investigates noise sensitivity of formation control with relative
measurements. Section 6.4 derives a lower bound on the smallest eigenvalue of the
Dirichlet Laplacian in terms of effective resistances, along with a summary of
applications where such a bound is useful.


6.2  Contributions  and  prior  work 

The  results  established  in  this  chapter  are  briefly  summarized  below: 

1. Graph structure and error propagation: We show in Section 6.3 that the
covariance of the steady-state formation error of a node is proportional to the
matrix-valued effective resistance between the node and the reference in an abstract
electrical network constructed from the interconnection graph. The formation
error of a node is the difference between its relative position w.r.t. a reference
and the desired relative position. The matrix-valued effective resistance was
introduced in Chapter 4. It was shown in Chapter 5 that graph structure
has a large impact on the effective resistance. This analogy with electrical
networks shows that the performance of the Laplacian disagreement control
is quite sensitive to the structure of the interconnection graph.

2. Bound on the Dirichlet Laplacian spectrum: In Section 6.4 we derive a lower
bound on the minimum eigenvalue of the Dirichlet Laplacian of a
matrix-weighted graph in terms of the effective resistances. The Dirichlet Laplacian
was introduced in Chapter 2. The stability margin of the closed loop
formation depends on this eigenvalue of the interconnection graph. In addition, the
convergence rate of the Jacobi algorithm described in Chapter 3 also depends
on this eigenvalue of the measurement graph. To determine the performance
of these algorithms, we need a lower bound on this eigenvalue; however, few
results are available in the literature on lower bounds on the Dirichlet
Laplacian spectrum. The bound derived here in terms of the effective resistances
is useful when bounds on the effective resistance can be derived. This is
possible for certain classes of graphs without complete knowledge of the graph
itself, when graph embedding techniques can be used to relate them to
graphs with known effective resistance, as described in Chapters 4 and 5.

We  see  from  the  discussion  above  that  the  generalized  effective  resistance, 
introduced  in  Chapter  4  in  connection  with  estimation  problems,  has  a  potential 
for  fruitful  use  in  control  problems  as  well,  especially  in  multi-agent  coordination 
problems  that  are  posed  in  terms  of  matrix-weighted  graphs. 

Prior work: Although the Laplacian disagreement control has been examined
extensively in connection with consensus algorithms (see, e.g., [29, 30, 129]), the
effect of interconnection structure on its performance has received scant attention.
Except in the special case of 1D formations of automated vehicular platoons, the
effect of disturbances on spacing errors has not been thoroughly investigated.
The platoon problem will be discussed extensively in the following two chapters.

Various forms of formation control using relative measurements have been
examined, including behavior-based approaches [23], control using artificial
potentials, etc. Typical results in this area consist of a control algorithm and a
proof of asymptotic stability. When vehicle dynamics or the control laws are
nonlinear [130-132], or when the interconnection graph is directed [133], establishing
asymptotic stability is itself challenging. So the examination of stability margins,
especially for large groups of autonomous agents, has yet to generate much
enthusiasm. Although it has been recognized that the convergence rate of the errors will
depend on the interconnection graph, most studies focus on “leaderless
coordination”, in which the Laplacian eigenvalues become important [128, 133]. This
emphasis is partly due to the popularity of “consensus algorithms” and their close
connection with formation control [30], in which leaderless coordination is natural.
However, in contrast to consensus algorithms, in formation control the presence
of a reference node in the interconnection graph is common and realistic, since an
average trajectory has to be specified to the whole formation in some form, which
is usually done through either a lead vehicle or a virtual leader, such as in [134].
In that case, the Dirichlet Laplacian eigenvalue is the critical quantity.

The spectrum of the Dirichlet Laplacian has attracted very little attention from
graph theorists. Except for a few results in [81], most work has focused on the
Laplacian spectrum. The minimum eigenvalue of the Dirichlet Laplacian is no larger
than the second smallest eigenvalue of the graph Laplacian, the latter also being
called the algebraic connectivity. Therefore, bounds on the algebraic connectivity,
on which an extensive literature exists, are not useful for lower bounding the minimum
eigenvalue of the Dirichlet Laplacian. The effective resistance based
lower bound is therefore quite useful, especially for graphs for which order
estimates of the effective resistances are known.


6.3  Formation  control  with  noisy  measurements 

Consider a group of $N$ mobile nodes moving in $k$-dimensional space. One
of the objectives of the group is to maintain a pre-specified formation defined
by the relative positions between nodes. In particular, denoting by $x_u \in \mathbb{R}^k$,
$u \in \mathcal{V} := \{1, 2, \dots, N\}$, the position of the $u$-th node, the control objective is to
make the positions converge to values for which

$$x_u - x_v = r_{u,v}, \quad \forall (u,v) \in \mathcal{V} \times \mathcal{V}, \tag{6.1}$$

where $r_{u,v}$ denotes the desired relative position of node $u$ with respect to node $v$.

Not all nodes are able to measure their relative positions with respect to all
other nodes, and therefore each node is constrained to use only a few relative
position measurements to compute its control signal. We denote by $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$
the set of ordered pairs of nodes that can measure their relative positions. In
particular, the existence of a pair $(u,v)$ in $\mathcal{E}$ signifies that node $u$ can measure
its position with respect to $v$. Thus the group is formally described in terms
of a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, whose nodes represent the mobile nodes and whose
edges represent the pairs of nodes that have access to a relative measurement. We
assume that a noisy relative measurement $y_{u,v}$ of the following form is available
to node $u$ if $(u,v) \in \mathcal{E}$:

$$y_{u,v} = x_u - x_v + \epsilon_{u,v}, \tag{6.2}$$

where $\epsilon_{u,v}$ is a white random noise process with auto-covariance

$$E[\epsilon_{u,v}(t_1)\,\epsilon_{u,v}^T(t_2)] = \delta(t_1 - t_2)\,R_{u,v}. \tag{6.3}$$

One of the nodes $o \in \mathcal{V}$ will be called the reference and it will move independently
of the remaining ones. The remaining nodes attempt to maintain the formation
specified by (6.1). The reference node may or may not be a physical agent. It
may be a virtual reference that is known to at least one of the physical agents. In
case $o$ is not a physical agent, an edge between the node $u$ and the reference $o$
means that the agent $u$ is able to measure its position with respect to the reference
$o$. Alternatively, the reference node may be a proxy for the moving frame of reference
if all the nodes are moving at a constant velocity.

We  assume  the  following: 

Assumption 6.3.1. 1. The graph $\mathcal{G}$ is time-invariant.

2. If $(u,v) \in \mathcal{E}$, then $(v,u) \in \mathcal{E}$.

3. The noise processes over different edges are independent of each other, i.e.,
$\epsilon_{u,v}(t)$ is independent of $\epsilon_{v,u}(t)$ for all $t \in \mathbb{R}_+$.

4. Even though the measurement errors on the two edges $(u,v)$ and $(v,u)$
connecting the nodes $u$ and $v$ are uncorrelated, they have the same
auto-covariance matrix; i.e., $R_{u,v} = R_{v,u}$.

For formation control problems the first assumption is not restrictive since the
desired formation is usually time invariant. The second assumption says that if
a measurement $y_{u,v}$ is available to $u$, then the measurement $y_{v,u}$ is available to $v$,
although both measurements will be corrupted by noise. Since the noise
corrupting the measurement of $x_u - x_v$ available to $u$ will (in general) be different from
the noise on the measurement of $x_v - x_u$ available to $v$, these two measurements
are distinct. The assumption that the noise affecting the measurement on the
edge $(u,v)$ has the same auto-covariance as the noise on $(v,u)$ is likely to be
satisfied when the two nodes use similar sensors. The assumption that whenever $(u,v)$ exists,
$(v,u)$ also exists may fail to hold in certain situations, such as when one node’s
sensor fails. We will refer to the assumption that one-way, asymmetric
measurement never takes place, together with the assumption that the noise covariances
on parallel edges are equal, as symmetric measurement. Fig. 6.1 shows an example
of such a symmetric directed graph. In the terminology of Section 2.2.2, $\mathcal{G}$ is a
matrix-weighted graph with edge weights specified by a function $W : \mathcal{E} \to \mathbb{S}^{k+}$,
where $W_{u,v} = R_{u,v}^{-1}$.

Our goal is to determine how the structure of the graph and the measurement
noises affect the formation error. The formation error of a node $u \in \mathcal{V}$ is defined
as

$$e_u(t) = x_u(t) - x_o(t) - r_{u,o}, \tag{6.4}$$

where $r_{u,o}$ is the desired relative position of $u$ w.r.t. the reference $o$. Due to
measurement noise, this error will be random, so we also look at its auto-covariance:

$$\Sigma_{u,o} = E\left[(e_u(t) - E[e_u(t)])(e_u(t) - E[e_u(t)])^T\right]. \tag{6.5}$$

Figure 6.1. A symmetric interconnection graph, its generalized incidence matrix
$A$, and the matrix $B$.

We use a control law in which each node uses all its measurements to construct
an optimal estimate of the difference between its current position and what
this “should” be, in view of what it knows about its neighbors’ positions. The
measurements available to an arbitrary node $u \in \mathcal{V}$ are

$$y_{u,v} = x_u - x_v + \epsilon_{u,v}, \quad v \in \mathcal{N}_u,$$

where $\mathcal{N}_u \subset \mathcal{V}$ denotes the set of nodes $v$ such that $(u,v) \in \mathcal{E}$. If node $u$ assumes
that all its neighbors are correctly positioned then, according to (6.1), the desired
position $x_u^*$ of $u$ is given by any one of the following equations:

$$x_u^* - x_v = r_{u,v}, \quad \forall v \in \mathcal{N}_u.$$

Combining the two previous sets of equations, we obtain

$$y_{u,v} = x_u - x_u^* + r_{u,v} + \epsilon_{u,v}, \quad \forall v \in \mathcal{N}_u,$$

from which node $u$ estimates its position error $x_u - x_u^*$. It is straightforward to
show that the best linear unbiased estimate of $x_u - x_u^*$ is given by

$$D_u^{-1} \sum_{v \in \mathcal{N}_u} R_{u,v}^{-1}\,(y_{u,v} - r_{u,v}),$$

where $D_u := \sum_{v \in \mathcal{N}_u} R_{u,v}^{-1}$. This motivates the following negative proportional
control law for the nodes:

$$\dot{x}_u = -\gamma D_u^{-1} \sum_{v \in \mathcal{N}_u} R_{u,v}^{-1}\,(y_{u,v} - r_{u,v}), \quad \forall u \in \mathcal{V} \setminus \{o\}, \tag{6.6}$$

where $\gamma$ denotes some positive number.
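As an illustration of the estimate and of the control law (6.6), the following minimal Python sketch treats the special case in which every noise covariance is isotropic, $R_{u,v} = \sigma_{u,v}^2 I_k$. This restriction is an assumption made here only to avoid matrix inverses; the best linear unbiased estimate then reduces to an inverse-variance weighted average of $y_{u,v} - r_{u,v}$ over the neighbors. The function names and the numbers are hypothetical.

```python
def blue_position_error(measurements, desired, variances):
    """Inverse-variance weighted average of (y_uv - r_uv) over the neighbors v.

    measurements[j] and desired[j] are length-k sequences for neighbor j;
    variances[j] is the scalar sigma^2 of that measurement (R = sigma^2 * I).
    """
    weights = [1.0 / s for s in variances]   # R_uv^{-1} in the isotropic case
    total = sum(weights)                     # D_u
    dim = len(measurements[0])
    return [
        sum(w * (y[i] - r[i]) for w, y, r in zip(weights, measurements, desired)) / total
        for i in range(dim)
    ]


def control_velocity(measurements, desired, variances, gamma=1.0):
    """Velocity command of (6.6): minus gamma times the estimated position error."""
    estimate = blue_position_error(measurements, desired, variances)
    return [-gamma * e for e in estimate]


# Node u with two neighbors in the plane (k = 2).
y = [(1.2, 0.1), (-0.7, 0.0)]    # noisy relative positions y_uv
r = [(1.0, 0.0), (-1.0, 0.0)]    # desired relative positions r_uv
sigma2 = [1.0, 3.0]              # the second sensor is noisier, so it gets less weight

error_estimate = blue_position_error(y, r, sigma2)
velocity = control_velocity(y, r, sigma2, gamma=0.5)
```

In this example the first neighbor's measurement carries three times the weight of the second, so the estimate is pulled toward the discrepancy seen through the more accurate sensor.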

For analysis purposes it is convenient to describe the system dynamics in terms
of positions with respect to the reference. Defining $\tilde{x}_u := x_u - x_o$, one concludes
that

$$\dot{\tilde{x}}_u = -\gamma D_u^{-1} \sum_{v \in \mathcal{N}_u} R_{u,v}^{-1}\,(\tilde{x}_u - \tilde{x}_v - r_{u,v} + \epsilon_{u,v}) - \dot{x}_o,$$

for every $u \in \mathcal{V} \setminus \{o\}$. By stacking all the positions $\tilde{x}_u$, $u \in \mathcal{V} \setminus \{o\}$, in a column
vector $\tilde{x}$, the above system can be written as follows:

$$\dot{\tilde{x}} = -\gamma M^{-1} L \tilde{x} + \gamma M^{-1} B W (r - \epsilon) - \dot{x}_o \mathbf{1}, \tag{6.7}$$

where $r$ is a column vector obtained by stacking all the $r_{u,v}$ on top of each other;
$\epsilon$ is a column vector obtained by stacking all the $\epsilon_{u,v}$; $\mathbf{1}$ is an $(n-1) \times 1$ column
vector of all 1’s; $W > 0$ is a block-diagonal matrix with $k$ rows/columns for each
edge in $\mathcal{E}$, with the weights $W_{u,v} := R_{u,v}^{-1}$, $(u,v) \in \mathcal{E}$, on the diagonal; $M > 0$
is a block-diagonal matrix with $k$ rows/columns for each node in $\mathcal{V} \setminus \{o\}$, with
$D_u$, $u \in \mathcal{V} \setminus \{o\}$, as defined earlier, on the diagonal; $L = \frac{1}{2} A_b W A_b^T$, where $A_b$ is
the generalized basis incidence matrix for the directed graph $(\mathcal{V}, \mathcal{E})$; and $B$ is a
matrix with $k$ rows for each vertex in $\mathcal{V} \setminus \{o\}$ and $k$ columns for each edge in
$\mathcal{E}$, constructed as follows: the $k$ columns corresponding to edge $(u,v) \in \mathcal{E}$ are
all equal to zero except for the block corresponding to the node $u$, which is equal
to $I_k$. The white noise process $\epsilon$ has block diagonal auto-covariance matrix given
by $E[\epsilon(t_1)\epsilon^T(t_2)] = \delta(t_1 - t_2) W^{-1}$. Figure 6.1 shows an example of the matrices
defined above.
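The matrices entering (6.7) can be assembled explicitly for a very small network. The sketch below is a hypothetical example with scalar positions ($k = 1$): a reference $o$ and two agents on a line, with both directions of every edge present. The sign convention used for $A_b$ (+1 where an edge leaves a node, -1 where it enters) is an assumption; it does not affect the product $A_b W A_b^T$. The example checks that $\frac{1}{2} A_b W A_b^T$ is the Dirichlet Laplacian of the underlying undirected graph and that $B W B^T$ equals $M$.

```python
from fractions import Fraction

# Hypothetical example: reference "o" and two agents on a line, o - 1 - 2 (k = 1).
nodes = [1, 2]                                   # rows: nodes other than the reference
edges = [(1, "o"), ("o", 1), (1, 2), (2, 1)]     # symmetric directed edge set
noise_var = {frozenset((1, "o")): Fraction(1),   # R_uv, equal on parallel edges
             frozenset((1, 2)): Fraction(2)}


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# W: diagonal matrix of edge weights W_uv = R_uv^{-1}, one entry per directed edge.
W = [[1 / noise_var[frozenset(e)] if i == j else Fraction(0)
      for j, _ in enumerate(edges)] for i, e in enumerate(edges)]

# A_b: +1 where the edge leaves the node, -1 where it enters (assumed sign convention).
A_b = [[int(e[0] == n) - int(e[1] == n) for e in edges] for n in nodes]

# B: 1 in the row of the node that owns the measurement on edge (u, v), i.e. node u.
B = [[int(e[0] == n) for e in edges] for n in nodes]

L = [[x / 2 for x in row] for row in matmul(matmul(A_b, W), transpose(A_b))]
M = matmul(matmul(B, W), transpose(B))
```

Here `L` comes out as the tridiagonal matrix with rows (3/2, -1/2) and (-1/2, 1/2), and `M` as its diagonal part, which is the relation $B W B^T = M$ used later in the proof of Theorem 6.3.1.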

The compact form of the closed loop dynamics (6.7) explains the terminology
Laplacian disagreement control: the name reflects the appearance of the Dirichlet
Laplacian $L$. Similar control laws have been studied in the literature on multi-vehicle
control and multi-agent consensus [29, 30, 128].

The  main  result  of  this  section  is  the  following: 

Theorem 6.3.1. Consider the problem of formation control with a finite number
of mobile nodes, described by the directed graph $\mathcal{G}$ and a function $R : \mathcal{E} \to \mathbb{S}^{k+}$
that describes the auto-covariance of the measurement error process (6.3), in which
every node implements the Laplacian disagreement control law (6.6). When
Assumption 6.3.1 holds, the closed loop is stable irrespective of the number of
vehicles, and the steady-state covariance matrix of the formation error of node $u$
is given by

$$\Sigma_{u,o} = \gamma R^{\text{eff}}_{u,o},$$

where $R^{\text{eff}}_{u,o}$ is the matrix-valued effective resistance between $u$ and $o$ in the
generalized electrical network $(\mathcal{G}, R)$. □


The scaling of the matrix-valued effective resistance as a function of the distance
$d_{u,o}$ of $u$ from the reference $o$ determines how the structure of the graph $\mathcal{G}$ affects
the growth of the effective resistance, and therefore of the formation error covariance.
Effective resistances in graphs were studied extensively in Chapter 5, where it was
identified that the structure of the graph greatly affects the effective resistances.
In sparse graphs, the effective resistance grows fast with distance from the
reference, whereas in dense networks it grows slowly. In view of Theorem 6.3.1, this
dependence of effective resistance on graph structure has significant implications
for the problem of formation control, which are discussed below.


6.3.1  Implications  for  man-made  autonomous  agents 

The preceding discussion shows that the maximum tracking errors in two
networks consisting of the same number of agents can be quite different. As a result,
some networks are more scalable than others in terms of tracking performance.
This knowledge can be used for designing networks that are formed by groups of
mobile autonomous agents, such as UAVs. Frequently, the formation structure of
such agents is designed solely on the basis of the task that the group is expected
to perform. However, our results show that a formation structure itself imposes
fundamental limitations on how well that formation can be maintained by the
agents. Thus, if the agents are required to maintain their formation accurately,
then the desired formation itself has to be appropriately chosen. For example, it
will be unwise to ask a large group of agents to fly in a single line while
maintaining very accurate spacings between neighbors, since we know that in such a
graph the tracking error grows linearly with the number of agents.
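The contrast between a single line and a denser arrangement can be made concrete with a small computation. The sketch below is illustrative only: it assumes scalar positions, unit noise variance on every measurement, and the relation $\Sigma_\infty = (\gamma/2) L^{-1}$, which solves the Lyapunov equation appearing in the proof of Theorem 6.3.1. It compares the worst steady-state error variance for 16 agents arranged in a line with that of the same 16 agents arranged in a 4 x 4 grid, with the reference at one end or corner. All function names are hypothetical.

```python
def dirichlet_laplacian(num_nodes, undirected_edges, reference=0):
    """Scalar Dirichlet Laplacian (unit weights) with the reference row/column removed."""
    keep = [n for n in range(num_nodes) if n != reference]
    index = {n: i for i, n in enumerate(keep)}
    lap = [[0.0] * len(keep) for _ in keep]
    for u, v in undirected_edges:
        for a, b in ((u, v), (v, u)):
            if a in index:
                lap[index[a]][index[a]] += 1.0
                if b in index:
                    lap[index[a]][index[b]] -= 1.0
    return lap


def inverse(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small examples)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def worst_error_variance(num_nodes, undirected_edges, gamma=1.0):
    """Largest steady-state variance over all nodes, using Sigma = (gamma / 2) * L^{-1}."""
    inv = inverse(dirichlet_laplacian(num_nodes, undirected_edges))
    return max(gamma / 2.0 * inv[i][i] for i in range(len(inv)))


def line_edges(n):
    return [(i, i + 1) for i in range(n - 1)]


def grid_edges(side):
    edges = []
    for row in range(side):
        for col in range(side):
            node = row * side + col
            if col + 1 < side:
                edges.append((node, node + 1))
            if row + 1 < side:
                edges.append((node, node + side))
    return edges


line_16 = worst_error_variance(16, line_edges(16))   # agents in a single line
grid_16 = worst_error_variance(16, grid_edges(4))    # same 16 agents in a 4 x 4 grid
```

For the line, the worst variance is $\gamma/2$ times the distance of the last agent from the reference, so it grows linearly with the number of agents; the grid value is several times smaller for the same group size.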

6.3.2  Implications  for  Swarming  in  Nature 

While the exact nature of motion coordination among biological agents is
still a mystery, the control law (6.6) is nevertheless an approximation of the
motion coordination schemes that are proposed to explain swarming behavior in
animals [135, 136]. This control law is extremely simple and requires only
information about nearby agents, which can be obtained by animals through their
vision and/or auditory sensors. Moreover, measurement noise is likely to affect
the relative position estimates as modelled in (6.2).

(a) A flock of birds in “V”-formation. (b) A school of fish.

Figure 6.2. Examples of 1-D and 3-D network topologies in natural swarms.
Photograph in (b) courtesy Sergey Parinov (http://www.sergeyphoto.com).

The analogy between effective resistance and formation error covariance can
explain a number of puzzling observations from nature. For example, it is well
known that several species of birds fly in a “V”-formation (cf. Fig. 6.2(a)). Although
why this happens is still a matter of debate (both drag reduction and better visual
cues have been offered to explain this phenomenon; see [22] for arguments for the
former and [127] for arguments against the former and for the latter), it is observed
that the birds close to the leader can maintain their relative positions better than
the birds toward the end of the arms of the “V”. An example is shown in the
photograph of Figure 6.2(a). Such large formation error in bird flocks might
be explained by the fact that a V-formation is sparse in 1-dimension and hence
the effective resistances are large when the number of birds in the flock is large
(see Theorem 5.3.1). On the other extreme, schools consisting of millions of fish
are known to move together in a 3-dimensional structure in a surprisingly agile
fashion [137]. In a large school of fish such as the one shown in Fig. 6.2(b), the
interconnection topology is 3-dimensional. It is not hard to see that the network in
such a large school will be dense in 3-D, if we imagine it as being a part of an infinite
graph. Hence the tracking error variance of the individuals remains bounded even
when the number of nodes (fish) making up the school is arbitrarily large. This
might explain why large fish schools can move together and maneuver quickly even
while forming an extremely large network, whereas a comparatively small number of
birds flying straight find it difficult to keep a constant separation.

Proof of Theorem 6.3.1. Construct a graph $\bar{\mathcal{G}} = (\mathcal{V}, \bar{\mathcal{E}})$ whose edge set $\bar{\mathcal{E}}$ consists
of exactly one edge for every pair of parallel edges in $\mathcal{E}$. Construct an edge-weight
function $\bar{W} : \bar{\mathcal{E}} \to \mathbb{S}^{k+}$ as follows:

$$\bar{W}_{u,v} = W_{u,v}. \tag{6.8}$$

It can be verified that, due to the assumption of symmetry, $L = \bar{A}_b \bar{W} \bar{A}_b^T$, where
$\bar{A}_b$ is the basis incidence matrix of $\bar{\mathcal{G}}$ w.r.t. $o$ and $\bar{W}$ is a block diagonal matrix
with $W_e$, $e \in \bar{\mathcal{E}}$, on the diagonal. Recalling the definition of the Dirichlet Laplacian of
a matrix-weighted graph from Section 2.2.2, we see that $L$ is exactly the
matrix-weighted Dirichlet Laplacian for the matrix-weighted graph $\bar{\mathcal{G}} = (\mathcal{V}, \bar{\mathcal{E}})$ with
boundary $\{o\}$ and with weight $\bar{W}_{u,v} = R_{u,v}^{-1}$ on every edge $(u,v)$. Since $\gamma > 0$, and
$M$ and $L$ are positive definite (see Theorem 2.2.1), we conclude that (6.7) is an
asymptotically stable system.

We further re-write (6.7) as

$$\dot{\tilde{x}} = -\gamma M^{-1} L \tilde{x} + w + b,$$

where $b := \gamma M^{-1} B W r - \dot{x}_o \mathbf{1}$ and $w := -\gamma M^{-1} B W \epsilon$ is a white noise random
process with auto-covariance matrix given by

$$E[w(t_1) w^T(t_2)] = \gamma^2 M^{-1} B W\, E[\epsilon(t_1)\epsilon^T(t_2)]\, W B^T M^{-1} = \gamma^2 \delta(t_1 - t_2)\, M^{-1} B W B^T M^{-1} = \gamma^2 \delta(t_1 - t_2)\, M^{-1},$$

where we used the fact that $B W B^T = M$. Since the Lyapunov equation

$$-\gamma M^{-1} L \Sigma_\infty - \gamma \Sigma_\infty L M^{-1} + \gamma^2 M^{-1} = 0$$

has a positive definite solution

$$\Sigma_\infty = \frac{\gamma}{2} L^{-1},$$

it is straightforward to show that the covariance matrix of $\tilde{x}$ converges to $\Sigma_\infty$. In
particular, the steady-state covariance matrix of the relative position $\tilde{x}_u := x_u - x_o$
is given by the corresponding $k \times k$ diagonal block of $\Sigma_\infty$.

From Corollary 4.6.1, this diagonal block is $\gamma/2$ times the effective resistance $\bar{R}^{\text{eff}}_{u,o}$
between $u$ and $o$ in the generalized electrical network $(\bar{\mathcal{G}}, \bar{R})$, where $\bar{R}_{u,v} = \bar{W}_{u,v}^{-1}$,
$(u,v) \in \bar{\mathcal{E}}$. Because of the parallel resistance formula (see Proposition 4.6.1
and Section 2.2.3), this effective resistance in the network $(\bar{\mathcal{G}}, \bar{R})$ is twice the
effective resistance in the network $(\mathcal{G}, R)$, which proves the theorem. ■
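The two key steps of the proof can be checked in exact arithmetic on a toy network. The sketch below is a sanity check under stated assumptions, not part of the original argument: positions are scalar, the graph is a line $o - 1 - 2 - 3$, and the noise variances and gain are arbitrary hypothetical values. It verifies that $(\gamma/2) L^{-1}$ makes the Lyapunov residual vanish, and that its diagonal equals $\gamma$ times the effective resistance in $(\mathcal{G}, R)$, where every link consists of two parallel edges.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(matrix):
    """Exact Gauss-Jordan inverse for a small nonsingular matrix of Fractions."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


gamma = Fraction(3, 4)
# Line o - 1 - 2 - 3 with scalar noise variances R on the three links.
R_links = [Fraction(1), Fraction(2), Fraction(1, 2)]
w = [1 / r for r in R_links]                       # weights W = R^{-1}

L = [[w[0] + w[1], -w[1], Fraction(0)],
     [-w[1], w[1] + w[2], -w[2]],
     [Fraction(0), -w[2], w[2]]]                   # Dirichlet Laplacian, boundary {o}
M = [[L[i][i] if i == j else Fraction(0) for j in range(3)] for i in range(3)]

L_inv, M_inv = inverse(L), inverse(M)
Sigma = [[gamma / 2 * x for x in row] for row in L_inv]   # candidate solution

# Residual of the Lyapunov equation; every entry should be exactly zero.
left = matmul(matmul(M_inv, L), Sigma)
right = matmul(matmul(Sigma, L), M_inv)
residual = [[-gamma * left[i][j] - gamma * right[i][j] + gamma**2 * M_inv[i][j]
             for j in range(3)] for i in range(3)]

# Effective resistance to o in (G, R): each link is two parallel edges of resistance R,
# so it contributes R / 2, and resistances add along the line.
r_eff, total = [], Fraction(0)
for r in R_links:
    total += r / 2
    r_eff.append(total)
```

Because the graph is a tree, the effective resistances can be accumulated along the path by hand, which gives an independent check of the diagonal of `Sigma`.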


6.4 Lower bound on the Dirichlet Laplacian spectrum
from effective resistance

In this section we will establish bounds on the smallest eigenvalue of the
Dirichlet Laplacian using effective resistances. Motivation for obtaining bounds on this
eigenvalue is summarized in the next section.

6.4.1 Role of the Dirichlet Laplacian spectrum

6.4.1.1 Convergence rate of discrete-time algorithms

We have already seen in Chapter 3 that the convergence rate of a distributed
estimation algorithm depends on this eigenvalue (cf. Theorem 3.3.4). The number
of iterations $n_{\text{iter}}(\epsilon)$ needed to drive the error ratio below a certain value $0 < \epsilon < 1$
is given by

$$n_{\text{iter}}(\epsilon) = \Theta\left(\frac{1}{\lambda_{\min}(L)}\right),$$

where $L$ is the Dirichlet Laplacian. Similar results hold for the well-known
average consensus algorithms, in which the nodes of a multi-agent network update
their state by computing the average of their states with that of their
neighbors [138]. When the communication graph of the average consensus algorithm is
a fixed undirected graph $\mathcal{G}$, and the nodes run the average consensus algorithm to
reach consensus with a reference node, which keeps its state fixed, then the error
dynamics of the nodes can be expressed as

$$x^{(i+1)} = J x^{(i)},$$

where $J$ is the Jacobi iteration matrix for the graph $\mathcal{G}$ with a reference node $o$,
which was defined earlier in (3.19). Since this is identical to the error dynamics of the
Jacobi algorithm described in Section 3.3.1.2, the results of Theorem 3.3.4 apply.
So the convergence rates of both the Jacobi and average consensus algorithms can
be directly obtained from $\lambda_{\min}(L)$.
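The dependence of the iteration count on $\lambda_{\min}(L)$ can be observed on a line graph. The sketch below is a rough illustration under assumptions that are not taken from Chapter 3: unit scalar weights, and an iteration in which every node replaces its error by the plain average of its neighbors' errors while the reference keeps zero error (this is the Jacobi form $J = I - D^{-1} L$, with $D$ the diagonal of $L$). The closed-form expression used for $\lambda_{\min}(L)$ is specific to this line graph. All names are hypothetical.

```python
import math


def jacobi_iterations(num_nodes, eps=1e-3, max_iter=100_000):
    """Iterations until the largest error falls below eps times its initial size.

    Node 0 is the reference (error fixed at 0); nodes 1..num_nodes start with error 1.
    """
    err = [0.0] + [1.0] * num_nodes
    for it in range(1, max_iter + 1):
        new = err[:]
        for u in range(1, num_nodes + 1):
            neighbors = [err[u - 1]] + ([err[u + 1]] if u < num_nodes else [])
            new[u] = sum(neighbors) / len(neighbors)   # average of neighbor errors
        err = new
        if max(abs(e) for e in err) < eps:
            return it
    return max_iter


def lambda_min_line(num_nodes):
    """Smallest eigenvalue of the Dirichlet Laplacian of a unit-weight line graph
    whose reference sits at one end (closed form for this special case)."""
    return 2.0 - 2.0 * math.cos(math.pi / (2 * num_nodes + 1))


counts = {n: jacobi_iterations(n) for n in (5, 10, 20)}
products = {n: counts[n] * lambda_min_line(n) for n in counts}
```

Doubling the length of the line roughly quadruples the iteration count, while the product of the count and $\lambda_{\min}(L)$ stays of the same order, in line with the $\Theta(1/\lambda_{\min}(L))$ scaling.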

6.4.1.2 Convergence rate of continuous-time algorithms

Recalling the formation control example of Section 6.3, we see that the stability
of the closed loop is determined by that of the system

$$\dot{\tilde{x}} = -\gamma M^{-1} L \tilde{x}. \tag{6.9}$$

Therefore, the time constant of the formation errors is determined by the smallest
eigenvalue of the matrix $M^{-1}L$, which in turn depends on $\lambda_{\min}(L)$. The dynamics (6.9)
also represent a continuous-time consensus algorithm [29], where the nodes of the
network are trying to reach consensus with the reference $o$. Therefore the time
constants of such consensus algorithms also depend on $\lambda_{\min}$.

Apart from the examples above, we will see in Section 7.3.1 another example
of the role of $\lambda_{\min}$, in the case of control of a platoon with a dynamic compensator.

6.4.2 Effective resistance between a node and a set of nodes

Before proceeding further, we define the matrix-valued effective resistance
between a node and a set of nodes, which can be thought of as an extension of the
generalized effective resistance between two nodes defined in Section 4.4.2. This
effective resistance definition is useful when there is more than one reference
node, e.g., when there are multiple leaders in a formation.

Consider  a  directed  graph  Q  =  (T'’,  “E)  and  let  Vr  C  V .  Recall  from  Sec¬ 
tion  2.2.1  that  Q  is  weakly  connected  to  “l^r,  where  if  there  is  a  (undirected)  path 
from  every  node  in  the  graph  to  at  least  one  of  the  nodes  in  4^.  Theorem  2.2.1 
shows  that  the  Dirichlet  Laplacian  L  of  the  network  ( Q ,  R )  is  invertible  if  and 
only  if  Q  is  weakly  connected  to  Vr.  Recall  that  L  =  AbWAj,  where  Ab  is  the 
generalized  basis  incidence  matrix  of  Q  and  W  is  the  block  diagonal  weight  matrix: 
W  :=  diag(W/i, . . . ,  Wm),  where  m  is  the  number  of  edges  (see  Section  2.2.2). 

We now formally define a node $u$’s effective resistance to $\mathcal{V}_r$, denoted by
$R^{\mathrm{eff}}_u(\mathcal{V}_r)$, as the $k \times k$ block in the main diagonal of $L^{-1}$ corresponding to the
$k$ rows/columns associated with the node $u \in \mathcal{V} \setminus \mathcal{V}_r$. This terminology is justified
by the fact that the matrix $R^{\mathrm{eff}}_u(\mathcal{V}_r)$ also expresses a map from (matrix-valued)
currents to (matrix-valued) voltages in an appropriately defined electrical network,
especially when the reference node set consists of a single node (see Section 4.4.2).
Here we restrict ourselves to finite networks, unlike Chapter 4.
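
The definition can be made concrete for scalar unit weights ($k = 1$, $W_e = 1$). The following Python sketch is illustrative only: the 4-cycle example graph and the helper names `dirichlet_laplacian`, `invert` and `effective_resistances` are assumptions introduced here, not part of the dissertation. It forms the Dirichlet Laplacian by deleting the rows and columns of the reference nodes and reads the effective resistances off the diagonal of its inverse.

```python
from fractions import Fraction

def dirichlet_laplacian(nodes, edges, ref):
    """Unit-weight graph Laplacian with the rows/columns of `ref` removed."""
    free = [u for u in nodes if u not in ref]
    idx = {u: i for i, u in enumerate(free)}
    L = [[Fraction(0)] * len(free) for _ in free]
    for u, v in edges:                      # edge direction is irrelevant here
        for a, b in ((u, v), (v, u)):
            if a in idx:
                L[idx[a]][idx[a]] += 1      # degree contribution
                if b in idx:
                    L[idx[a]][idx[b]] -= 1  # coupling between two free nodes
    return free, L

def invert(A):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(A)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def effective_resistances(nodes, edges, ref):
    """Diagonal of the inverse Dirichlet Laplacian, keyed by free node."""
    free, L = dirichlet_laplacian(nodes, edges, ref)
    L_inv = invert(L)
    return {u: L_inv[i][i] for i, u in enumerate(free)}

# Hypothetical example: the 4-cycle 0-1-2-3-0 with unit resistances.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
single = effective_resistances(nodes, edges, ref={0})
double = effective_resistances(nodes, edges, ref={0, 2})
print(single)
print(double)
```

With the single reference node 0, the values agree with the usual series/parallel rule (a unit resistor in parallel with three in series gives 3/4 for nodes 1 and 3). Adding node 2 to the reference set lowers the resistance of nodes 1 and 3 to 1/2.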
6.4.3 Lower bound from effective resistance

The lower bound on the smallest eigenvalue of the Dirichlet Laplacian follows
from the next lemma. Here $\mathcal{L}$ is the Laplacian and $L$ is the Dirichlet Laplacian.


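Lemma 6.4.1 (Spectrum of $L$ and $\mathcal{L}$). Assume that $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is weakly
connected to $\mathcal{V}_r \subset \mathcal{V}$ and denote by $\lambda_1(\mathcal{L}) \le \lambda_2(\mathcal{L}) \le \cdots \le \lambda_{nk}(\mathcal{L})$ the sorted
eigenvalues of $\mathcal{L}$ and by $\lambda_1(L) \le \lambda_2(L) \le \cdots \le \lambda_{(n-n_r)k}(L)$ the sorted eigenvalues
of $L$. For every $i \in \{1, 2, \dots, (n - n_r)k\}$,

$$\lambda_i(L) \ge \frac{1}{\sum_{u \in \mathcal{V} \setminus \mathcal{V}_r} \operatorname{trace} R^{\mathrm{eff}}_u(\mathcal{V}_r)} \qquad (6.10)$$

and

$$\lambda_i(\mathcal{L}) \le \lambda_i(L) \le \lambda_{i + k n_r}(\mathcal{L}). \qquad (6.11)$$

□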


Proof of Lemma 6.4.1. By definition, the effective resistance is a diagonal
block of $L^{-1}$. The inequality (6.10) is a consequence of the fact that any eigenvalue
of the positive definite matrix $L^{-1}$ can be upper-bounded by its trace, which can
be obtained by adding up all the traces of its diagonal blocks $R^{\mathrm{eff}}_u(\mathcal{V}_r)$, $u \in \mathcal{V} \setminus \mathcal{V}_r$.
This means that every eigenvalue of $L^{-1}$ satisfies:

$$\lambda_i(L^{-1}) \le \operatorname{trace} L^{-1} = \sum_{u \in \mathcal{V} \setminus \mathcal{V}_r} \operatorname{trace} R^{\mathrm{eff}}_u(\mathcal{V}_r),$$

from which (6.10) follows since the eigenvalues of $L^{-1}$ and $L$ are reciprocals of each
other. The inequality (6.11) is a direct application of the Interlacing Eigenvalues
Theorem [139, Theorem 4.3.15] to the symmetric matrices $L$ and $\mathcal{L}$, upon noting
that $L$ is a principal submatrix of $\mathcal{L}$ (cf. definitions of $L$ and $\mathcal{L}$ in Section 2.2.2).
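
The bound (6.10) can be checked numerically on a small example. The sketch below assumes unit scalar weights and uses the 4-cycle with one reference node, for which the Dirichlet Laplacian is the 3 x 3 matrix written in the code. The helpers `invert` and `largest_eigenvalue` (plain power iteration) are hypothetical names chosen for this illustration, not routines from the dissertation.

```python
import math

def invert(A):
    """Gauss-Jordan inverse with partial pivoting (floating point)."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def largest_eigenvalue(A, iters=2000):
    """Power iteration; adequate for a small symmetric positive definite A."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam

# Dirichlet Laplacian of the unit-weight 4-cycle with one reference node.
L = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
L_inv = invert(L)

# Sum of the effective resistances = trace of the inverse (k = 1 here).
trace_inv = sum(L_inv[i][i] for i in range(len(L)))
bound = 1.0 / trace_inv                       # right-hand side of (6.10)
lam_min = 1.0 / largest_eigenvalue(L_inv)     # smallest eigenvalue of L

print(f"bound = {bound:.4f}, lambda_min(L) = {lam_min:.4f}")
```

In this example the sum of effective resistances is 3/4 + 1 + 3/4, so the bound is 0.4, while the smallest eigenvalue of $L$ is $2 - \sqrt{2}$; the bound holds but is not tight.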
Figure 6.3. The relationship between algebraic connectivity and the smallest
eigenvalue of the Dirichlet Laplacian.

To understand the implication of this result and its use, consider the case when
all the edge weights are scalars and equal to unity, i.e., $W_e = 1$ for every $e \in \mathcal{E}$.
In this case $\mathcal{L} = L_0$, where $L_0$ is the usual graph Laplacian of $\mathcal{G}$, thinking of $\mathcal{G}$ as
an undirected graph. The unweighted Dirichlet Laplacian $L$ with the boundary
$\{o\}$ is a principal submatrix of $L_0$ obtained by removing the row and column
corresponding to $o$. By the interlacing of eigenvalues for symmetric matrices, we
have

$$\lambda_{\min}(L) \le \lambda_2(L_0),$$

where $\lambda_2(L_0)$ is the second smallest eigenvalue of the Laplacian, which is also
called the algebraic connectivity of $\mathcal{G}$ [82]. This interlacing is shown graphically
in Figure 6.3.

Therefore any upper bound on the algebraic connectivity is an upper bound on
$\lambda_{\min}(L)$ as well, but a lower bound on the algebraic connectivity is not necessarily
a bound on $\lambda_{\min}$ of any kind. As a result, although an extensive literature exists on
bounding the algebraic connectivity of a graph (see [140, 141], and especially [81]
for a good overview), these results are not useful in bounding $\lambda_{\min}$ from below. On
the other hand, we need lower bounds on $\lambda_{\min}$ to bound the worst-case performance
for the applications discussed in Section 6.4.1.
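
The gap between the two quantities can be seen on a unit-weight path graph with nodes $o, 1, \dots, N$ and boundary $\{o\}$. The Python sketch below relies on the standard closed-form eigenvalues of tridiagonal path matrices, which are an assumption brought in for this illustration rather than a result stated in the dissertation: $\lambda_2(L_0) = 2 - 2\cos(\pi/(N+1))$ and $\lambda_{\min}(L) = 2 - 2\cos(\pi/(2N+1))$. The function names are hypothetical.

```python
import math

def lambda2_path(n_nodes):
    """Algebraic connectivity of the unit-weight path graph on n_nodes nodes."""
    return 2.0 - 2.0 * math.cos(math.pi / n_nodes)

def lambda_min_dirichlet_path(n_free):
    """Smallest Dirichlet eigenvalue when one end node is the boundary."""
    return 2.0 - 2.0 * math.cos(math.pi / (2 * n_free + 1))

rows = []
for N in (2, 10, 100, 1000):
    lam2 = lambda2_path(N + 1)            # path on nodes o, 1, ..., N
    lmin = lambda_min_dirichlet_path(N)   # row and column of o removed
    rows.append((N, lmin, lam2, lmin / lam2))
    print(f"N={N:5d}  lambda_min(L)={lmin:.3e}  "
          f"lambda_2(L0)={lam2:.3e}  ratio={lmin / lam2:.3f}")
```

The ratio settles near 1/4 as $N$ grows, so in this family a lower bound on the algebraic connectivity overestimates $\lambda_{\min}(L)$ by roughly a factor of four and cannot serve as a lower bound on it.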




6.5  Comments  and  open  problems 


In this section we saw evidence that the matrix-valued effective resistance
introduced  in  earlier  chapters  is  useful  in  the  analysis  of  decentralized  control 
problems  as  well.  The  effect  of  graph  structure  in  propagating  noise  was  examined, 
and  it  was  shown  that  the  noise  propagation  can  be  characterized  by  the  effective 
resistance.  Effective  resistance  also  yielded  an  unexpected  benefit  -  a  lower  bound 
on  the  Dirichlet  Laplacian  eigenvalues  was  obtained  as  a  function  of  the  effective 
resistances.  This  bound  is  potentially  valuable  since  there  are  few  results  on  lower 
bounding  Dirichlet  Laplacian  eigenvalues,  which  appear  in  several  control  and 
estimation  problems. 

Making good use of this bound requires knowledge of the effective resistances
in the graph. Effective resistances have seen renewed popularity in recent years,
and several aspects of effective resistances in graphs have been investigated. After
the seminal work of [2] on transience and recurrence of random walks in infinite
graphs using effective resistance, Chandra et al. [14] showed that the cover and
commute times of a random walker in a finite graph are also captured by the effective
resistances in the graph. The sum of all pairwise effective resistances for a number
of special graphs has been studied in Zhang and Yang [142], whereas Ghosh et al.
[143] provide numerical methods for minimizing the total effective resistance by
choosing the edge resistances appropriately (subject to some constraints). Wu
[120] obtained exact formulas for effective resistances in finite 2D and 3D grids.
The matrix-valued effective resistances when each edge has a constant generalized
resistance can be easily obtained from these results by employing Lemma 4.6.1,
which relates the two.

Only time-invariant interconnections were examined with regard to both error
propagation and stability margin. In certain situations the interconnections between
the agents might vary with time. Error propagation in such time-varying
graphs is an interesting open problem. Furthermore, the interconnection
structure considered was symmetric, which is the reason the Dirichlet Laplacian
and effective resistance appear naturally. Revisiting the issues examined here,
but with asymmetric interconnection, is a wide open problem.

Another limitation of the formation control problem described in Section 6.3
is that the agent dynamics and the control laws are quite simple, with the agent
dynamics assumed to be first order and the control law proportional. There is a need
to examine the issues of error propagation in multi-agent control systems with
more complex dynamics and control laws. One such problem will be examined in
the next chapter.

Chapter 7


Control  of  vehicular  platoons: 
symmetric  bidirectional  control 

7.1  Introduction 

In this chapter we study the problem of controlling a string of vehicles moving
in one dimension such that they all follow a lead vehicle with a constant
spacing between successive vehicles (cf. Figure 7.1). The capacity of highways
can be increased by a significant amount if small inter-vehicular distances can
be maintained [6]. Since human drivers cannot be expected to maintain small
inter-vehicular distances for safety reasons, one way to achieve this objective
is automated driving using feedback control. A successful demonstration of a
platoon of eight vehicles automatically controlled to follow a lead vehicle was conducted
in 1997 by the National Automated Highway Systems Consortium under
the California PATH (Partners for Advanced Transit and Highways) program [144].
Due to its relevance to developing automated highway systems, the problem of
controlling a platoon of vehicles has been studied extensively. We will discuss the
prior work on this problem in Section 7.2.1.

Figure 7.1. A platoon of vehicles.

The automated platoon problem is a special case of the formation control
problem introduced in the previous chapter. The difference
here is that the formation we are trying to keep is in $\mathbb{R}^1$ instead of
2 or 3 dimensions. Simplification in the spatial dimension is accompanied by
added complexity in other aspects of the problem: we will allow higher-order
dynamics and controllers than in Chapter 6.

Consider a platoon of $N$ vehicles moving in one dimension following a lead
vehicle, as shown in Figure 7.1. The lead vehicle moves independently of the other
$N$ vehicles, which try to maintain a constant gap $\Delta$ between successive vehicles.
The lead vehicle may be a real vehicle with its own dynamics, or a fictitious vehicle
that represents a reference trajectory provided to the first vehicle of the platoon.
Let $Z_0(t)$ denote the position of the lead vehicle and $Z_i(t)$, $i \in \{1, 2, \dots, N\}$, the
position of the $i$th vehicle. The $i$th vehicle can measure the errors with respect
to its predecessor and follower, namely, $e_{i,i-1}(t)$ and $e_{i+1,i}(t)$, by on-board sensors
such as radars, where

$$e_{i,i-1}(t) := (Z_{i-1}(t) - Z_i(t) - \Delta) + \epsilon_{i,i-1}(t).$$

Here the desired spacing, $\Delta$, is a positive constant and $\epsilon_{i,i-1}(t)$ is measurement
noise.

Our interest is in decentralized control architectures in which every vehicle
computes  its  control  signal  based  on  locally  available  spacing  error  measurements. 
The  control  refers  to  the  signal  fed  to  the  actuator  that  drives  the  vehicle.  How 
the  control  signal  affects  the  position  and  velocity  of  the  vehicle  therefore  depends 
on  the  model  of  the  vehicle  dynamics.  Although  the  dynamics  of  a  highway  vehicle 
are  typically  non-linear,  they  can  be  converted  to  that  of  a  double  integrator  (i.e., 
a  point  mass  without  damping)  by  feedback  linearization  [145,  146].  However, 
since  the  dominant  aerodynamic  drag  on  a  vehicle  is  quadratic  in  its  velocity  (see 
the  models  described  in  [146,  147])  a  Jacobian  linearization  around  the  nominal 
velocity  results  in  a  model  that  has  a  single  integrator  in  series  with  a  low  pass 
filter.  We  will  consider  both  types  of  linear  models  of  vehicles,  with  two  integrators 
as  well  as  one. 

An extensively studied decentralized control architecture is predecessor
following, in which the control action on a particular vehicle depends on its spacing
error with the predecessor, i.e., the vehicle in front of it. However, the
predecessor-following architecture is known to suffer from the limitation that the disturbances
acting on the vehicles lead to large inter-vehicular spacing errors. We will discuss
these limitations in Section 7.2.

Another  decentralized  control  architecture  that  is  investigated  in  the  literature 
is  bidirectional  control.  In  this  scheme,  the  control  action  on  a  particular  vehicle 
depends  on  the  spacing  errors  with  respect  to  its  predecessor  and  its  follower. 
Most  human  drivers  use  information  about  preceding  and  following  vehicles  to 
control  their  own  vehicles  -  especially  in  heavy  traffic,  so  bidirectional  control 
is intuitively appealing. In symmetric bidirectional control, the control effort is
equally dependent on the spacing errors with the preceding vehicle and the following
vehicle. The effect of disturbances acting on the vehicles on the spacing
errors with symmetric bidirectional control was analyzed by Seiler et al. [32], who
showed that when the vehicle model has two integrators and the controller does not
have an integrator, the disturbances acting on the vehicles result in large spacing
errors.

In this chapter we examine symmetric bidirectional control and answer the
questions left unanswered in [32]. We answer the question of stability and disturbance
amplification when the vehicle model has either one, or more than two
integrators; and also characterize the effect of the lead vehicle’s trajectory on the
spacing errors of all the vehicles. The results are independent of the choice of
the controller and are due to the interconnection structure imposed by the symmetric
bidirectional architecture.

Chapter organization: We first review the literature on the control of platoons
in Section 7.2 and then briefly summarize our results. In Section 7.3, we formulate
the problem and present the results. Every theorem is about a
different  aspect  of  the  problem,  such  as  stability,  steady-state  tracking  error,  and 
disturbance  amplification,  and  hence  is  presented  in  a  separate  subsection.  An 
intuitive  explanation  of  the  results  is  provided  in  Section  7.3.5.  The  chapter 
concludes  with  comments  on  open  issues  in  Section  7.4. 


7.2  Prior  work  and  contributions 

7.2.1  Prior  work  on  vehicular  platoons 

There is an extensive literature on the automated vehicular platoon problem.
Early work on platoons dates back at least half a century, to [31].
Early interest in this problem, in the 50’s and 60’s, stemmed from the proposals to
build Automated Guided Transit (AGT) systems with electrically powered vehicles
as a way of mitigating the increasing urban problem of “..congested roadways, large
numbers of accidents and fatalities, and extremely powerful automobiles [148]”.
Due to space limitations, we will only review work on the so-called “constant-spacing
policy”, in which the goal is to maintain constant inter-vehicular separation.
Other policies, such as constant-time headway and constant safety-factor
policies [149], will not be discussed.

Among decentralized schemes, the predecessor-following architecture was the
one studied first and perhaps the most (see [7, 149-151] and references therein).
Still, it was recognized early on that disturbances acting on the vehicles tend to get
amplified in predecessor-following control. For this reason, predecessor-following
was considered not stable in [31]. The disturbance amplification tendency is usually
referred to as “string instability”. Although a precise definition of string
instability, mainly motivated by the platoon problem, was offered much later
in [152], the term itself had been in vigorous use for a long time. Mention of the phrase
can be found in such early references as [149, 151]. The term “spatial asymptotic
stability” was also used in place of string stability in [150]. More recently,
the term “slinky-type effects” has also been used to describe the phenomenon of
error amplification in vehicle strings [153].

Although the predecessor-following architecture was extensively studied, it took
quite a while before a thorough understanding of its limitations was developed.
That the limitations are fundamental in nature and independent of the
controller’s design was shown much later, by Seiler et al. [32].

As the limitations of predecessor-following were observed quite early, they led
to the proposal of a predecessor-and-leader-following architecture, in which the
control action at a vehicle depends on, in addition to the predecessor’s error, the
error with respect to the lead vehicle in the platoon. It was shown, e.g., in [151],
that such a scheme will in fact damp out disturbances all along the platoon.
Clearly, it is not a decentralized architecture. The demonstration of automated
platooning in 1997 used this architecture [144].

Unfortunately,  the  predecessor-and-leader-following  scheme  also  suffers  from 
severe limitations. In particular, it was shown in Liu et al. [154] that the closed loop
becomes  highly  sensitive  to  the  time  delays  incurred  in  transmitting  the  lead 
vehicle’s  position  information  to  the  rest  of  the  platoon. 

The LQR control of platoons, which typically leads to a centralized architecture,
was investigated as early as 1966 [155]. LQR control of an infinite string of
vehicles was investigated in [156] and in [157], though a more complete analysis
was provided only in 2005 by Jovanovic and Bamieh [158]. It was shown in [158]
that the optimal control of the platoon is effectively ill-posed when the number of
vehicles is large, namely, that the decay rate of the closed loop becomes arbitrarily
small (i.e., its time constant becomes arbitrarily large) as the number of vehicles
becomes arbitrarily large. Since the optimal
control of a platoon suffers from such limitations, it is perhaps not surprising that
the decentralized control suffers from limitations as well.

The discussion above shows that even centralized schemes, such as LQR
and predecessor-and-leader-following, fail to remove the difficulties of the decentralized
predecessor-following scheme. Therefore, it behooves us to examine other decentralized
architectures that, although they cannot be expected to be free of all
the limitations discussed above, at least have the potential of performing reasonably
well. The bidirectional architecture is a natural choice; it was analyzed
in [159] and claimed not to suffer from string instability. This claim was, however,
erroneous since the measure of string stability used only ensured that disturbances
will be damped out in going from the front to the back of the platoon, but not the
other way around. The bidirectional architecture was also investigated in the nonlinear
setting by Zhang et al. [153], whose proposed design was able to achieve
stability without “slinky-type effects”, which is another name preferred by several
researchers in place of string instability. In the linear case, the disturbance amplification
properties of the symmetric bidirectional scheme were examined in [32],
which showed that for a certain class of plants and controllers this architecture
also suffers from limitations that are controller-independent.

7.2.2  Main  results 

The results presented in this chapter are summarized below. All the results
apply  only  to  the  case  of  symmetric  bidirectional  architecture,  and  when  the 
vehicle  and  controller  models  are  linear.  In  the  sequel,  the  plant  H(s)  denotes  the 
transfer  function  from  control  input  to  vehicle  position,  and  the  controller  K(s) 
denotes  the  transfer  function  from  the  position  error  to  the  control  input. 

1. It is possible to design the controller $K(s)$ so that the closed loop is stable
for an arbitrarily large but finite number of vehicles, as long as the number
of integrators in the loop $H(s)K(s)$ is not more than two. When $H(s)K(s)$
has either one or two integrators, if $H(s)$ is non-minimum phase, then for every
stable controller $K(s)$ the closed loop will become unstable for a sufficiently
large number of vehicles. If the total number of integrators in $H(s)K(s)$ is
more than two, then the closed loop will be unstable for a sufficiently large
number of vehicles, irrespective of how $K(s)$ is designed.

2. When $H(s)K(s)$ has two integrators, if the lead vehicle moves at constant
velocity, the steady state spacing errors for every vehicle will go to 0, irrespective
of the number of vehicles in the platoon. When $H(s)K(s)$ has only
one integrator, if the lead vehicle moves at a constant velocity, the steady
state error is finite for a finite platoon size, but the norm of this error grows
without bound as the number of vehicles in the platoon increases.

3. When $H(s)K(s)$ has two integrators, in the absence of disturbances on the
vehicles, if the lead vehicle trajectory deviates from a constant-velocity one,
the $\mathcal{L}_2$ norm of the spacing errors will grow unbounded as the number of
vehicles increases, even if the deviation has bounded $\mathcal{L}_2$ norm. However,
when $H(s)K(s)$ has only one integrator, if the deviation of the lead vehicle’s
trajectory from a constant-velocity one is $\mathcal{L}_2$-norm bounded, then the
spacing errors of the entire platoon are $\mathcal{L}_2$-norm bounded, too, irrespective
of the number of vehicles.

4. When $K(s)$ has no integrators, the $\mathcal{H}_\infty$ norm of the transfer function from
the disturbances acting on the vehicles to the spacing errors will grow without
bound as the number of vehicles increases, irrespective of whether $H(s)$
has one or two integrators. Thus, even if the lead vehicle is moving at
constant velocity, if disturbances are present in the control signal, as they
invariably will be, large spacing errors will result for a large platoon.

The case of $H(s)K(s)$ having no integrators is not considered, since a realistic
model of a vehicle for a highway will have at least one integrator. In fact, vehicle
models that are simply double integrators (fully actuated point masses with no damping)
are quite common in the literature [32, 145, 146]. Even models with three
integrators have been studied in the literature, e.g., the vehicle model in [147],
which results from feedback linearization of high-order non-linear vehicle models.
The effect of disturbances when $K(s)$ and $H(s)$ each have one integrator has not been
studied in this dissertation.

Figure 7.2. The interconnection architecture graph for symmetric bidirectional
control. The symmetry of the interconnection graph is manifested in the graph
being undirected.

It  is  important  to  notice  that  the  results  are  independent  of  how  the  controller 
K(s)  is  chosen.  In  short,  the  limitations  of  the  symmetric  control  architecture 
cannot  be  ameliorated  with  clever  control  design.  The  results  in  this  chapter  have 
been  reported  earlier  in  [160].  The  results  for  H(s)  with  two  or  more  integrators 
were  also  proved  by  Yadlapalli  et  al.  [161]  independently. 

7.3  Problem  statement  and  main  results 

Let $\mathbb{N}$, $\mathbb{R}$ and $\mathbb{C}$ denote the sets of natural, real and complex numbers, respectively.
As shown schematically in Figure 7.1, the platoon consists of $N$ vehicles
moving in one dimension following a lead vehicle, where the lead vehicle moves
independently of the other $N$ vehicles. Let $Z_0(t)$ denote the position of the lead
vehicle and $Z_i(t)$, $i \in \{1, 2, \dots, N\}$, the position of the $i$th vehicle. The spacing
error of the $i$th vehicle is defined by

$$E_i(t) = Z_{i-1}(t) - Z_i(t) - \Delta, \qquad (7.1)$$

where the desired spacing, $\Delta$, is a positive constant. The lead vehicle may be a real
vehicle with its own dynamics, or a fictitious vehicle that represents a reference
trajectory provided to the first vehicle of the platoon. The control objective is
to keep the spacing error for every vehicle as small as possible while maintaining
closed loop stability. We assume that




1.  the  dynamics  of  individual  vehicles  are  identical,  and  the  transfer  function 
from  control  input  to  vehicle  position  is  denoted  by  H(s), 

2.  H(s)  is  SISO  and  has  at  least  one  integrator, 

3.  all  vehicles  use  the  same  control  law,  and 

4. the string of vehicles starts with zero spacing errors, from rest, and the lead
vehicle starts at $Z_0(0) = 0$. Hence, $Z_i(0) = -i\Delta$.

Let $X(s)$ denote the Laplace transform $\mathcal{L}(\cdot)$ of a time-domain signal $X(t)$:

$$X(s) := \mathcal{L}(X(t)).$$

Applying the assumptions, each vehicle can be modelled in the Laplace domain
as

$$Z_i(s) = H(s)(U_i(s) + D_i(s)) + \frac{Z_i(0)}{s}, \quad 1 \le i \le N, \qquad (7.2)$$

where $Z_i(0)$ is the initial position of the $i$th vehicle, $U_i(s)$ is the Laplace transform
of the control signal, and $D_i(s) = \mathcal{L}(D_i(t))$ is the Laplace transform of the input
disturbance $D_i(t)$ to the $i$th vehicle. The effect of measurement noise can be
absorbed into the input disturbances $D_i(t)$, so from now on we can assume that
the noise-free errors $E_i(t)$ can be measured by the vehicles.

The $i$th spacing error in the Laplace domain is given by

$$E_i(s) = Z_{i-1}(s) - Z_i(s) - \frac{\Delta}{s}, \quad 1 \le i \le N. \qquad (7.3)$$

Using (7.3) and (7.2) and using $Z_i(0) = -i\Delta$, we can write the error dynamics of
the entire vehicle platoon as

$$E(s) = Z_0(s)\phi_1 + P(s)[D(s) + U(s)], \qquad (7.4)$$

where $\phi_1 \in \mathbb{R}^N$ is the first element of the canonical basis of $\mathbb{R}^N$ and

$$E(s) := \mathcal{L}(E(t)), \quad E(t) = [E_1(t) \ \dots \ E_N(t)]^T,$$
$$D(s) := \mathcal{L}(D(t)), \quad D(t) = [D_1(t) \ \dots \ D_N(t)]^T,$$
$$U(s) := [U_1(s) \ \dots \ U_N(s)]^T,$$
$$P(s) := -H(s)M^T,$$

where $M$ is defined as

$$M := \begin{bmatrix} 1 & -1 & & \\ & \ddots & \ddots & \\ & & 1 & -1 \\ & & & 1 \end{bmatrix} \in \mathbb{R}^{N \times N}. \qquad (7.5)$$


In a symmetric bidirectional control scheme, each vehicle bases its control action
on the error feedback from its predecessor and follower with equal emphasis. The
control action is

$$U_i(s) = K(s)(E_i(s) - E_{i+1}(s)), \quad 1 \le i < N. \qquad (7.6)$$

Since the last vehicle in the string does not have a follower, it uses the controller
$U_N(s) = K(s)E_N(s)$. The vector of platoon control inputs is given by

$$U(s) = K(s)ME(s),$$


which is a restatement of (7.6). Eliminating $U(s)$ from (7.4) we can write the
closed-loop error dynamics of the platoon, which is given by

$$E(s) = G_{x_o e}(s)Z_0(s) + G_{de}(s)D(s), \qquad (7.7)$$

where

$$G_{x_o e}(s) = [I + H(s)K(s)L]^{-1}\phi_1, \qquad (7.8)$$
$$G_{de}(s) = -H(s)[I + H(s)K(s)L]^{-1}M^T, \qquad (7.9)$$

and $L := M^T M \in \mathbb{R}^{N \times N}$. The matrix $L$ is given by

$$L = \begin{bmatrix} 1 & -1 & 0 & \cdots & \\ -1 & 2 & -1 & \cdots & \\ 0 & -1 & 2 & -1 & \cdots \\ & & \ddots & \ddots & -1 \\ & & & -1 & 2 \end{bmatrix}. \qquad (7.10)$$

The matrix $L$ is similar to the Laplacian matrix of the undirected graph whose
nodes are the vehicles and whose edges are the measurements/communications between
neighboring vehicles. In fact, $L$ is exactly the Dirichlet Laplacian (defined
in Section 2.2.2) for the line graph with $N$ nodes with unity edge weights and a
node at the end as the boundary, which is shown in Figure 7.2. This graph describes
the interconnection structure among the vehicles in an $N$-vehicle platoon
with symmetric bidirectional control.

The results on various aspects of the problem, including closed loop stability,
steady  state  error,  amplification  of  disturbance  in  the  lead  vehicle’s  trajectory, 
and  of  the  disturbances  acting  on  all  the  vehicles,  are  presented  next. 
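
The structure of $M$ and $L$ can be reproduced in a few lines. The Python sketch below is illustrative: the function names are hypothetical, and the controller $K(s)$ is replaced by a static gain (an assumption made only to show how $U = K M E$ encodes (7.6)).

```python
def build_M(N):
    """Upper bidiagonal M of (7.5): ones on the diagonal, -1 just above it."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(N)]
            for i in range(N)]

def build_L(N):
    """L = M^T M, formed entry by entry."""
    M = build_M(N)
    return [[sum(M[k][i] * M[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def control_inputs(E, gain=1.0):
    """U = gain * M E, i.e. U_i = gain*(E_i - E_{i+1}) and U_N = gain*E_N."""
    M = build_M(len(E))
    return [gain * sum(m * e for m, e in zip(row, E)) for row in M]

L = build_L(4)
for row in L:
    print(row)

# Hypothetical spacing errors for a three-vehicle platoon.
U = control_inputs([0.3, 0.1, -0.2])
print(U)
```

For $N = 4$ the product is tridiagonal with first diagonal entry 1 and the remaining diagonal entries 2, matching (7.10).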

7.3.1 Closed loop stability with symmetric bidirectional control

Theorem  7.3.1.  Consider  the  closed  loop  error  dynamics  of  the  platoon  with 
symmetric  bidirectional  control,  given  by  (7.7). 

1. For closed loop stability of the platoon with arbitrary $N$, $H(s)K(s)$ cannot
have more than two integrators.

2. For closed loop stability of the platoon with $N$ vehicles following the leader,
every transfer function $G_i(s) = 1/(1 + \lambda_i H(s)K(s))$, $i \in \{1, 2, \dots, N\}$, must
be stable, where $\lambda_i$ is the $i$th eigenvalue of the matrix $L \in \mathbb{R}^{N \times N}$ defined in
(7.10), and consequently, $K(s)$ cannot have zeros at 0.

3. Define $H(s)K(s) = C(s)/s^k$ with $C(0)$ finite. Then for closed loop stability
with arbitrary $N$, we must have $C(0) > 0$. □

We have already discussed that for $H(s)$ to be a reasonable model of a vehicle
on a highway, $H(s)$ must have at least one integrator. The theorem above shows
that for the closed loop to be stable with symmetric bidirectional control for
an arbitrary number of vehicles, the vehicle dynamic model cannot have more than
two integrators. Moreover, closed loop stability for arbitrary $N$ is impossible when
either the vehicle dynamics or the controller is non-minimum phase. It is clear
from the theorem that the stability margin will be determined by $\lambda_{\min}(L)$, which is
related to the effective resistances in the control architecture graph in Figure 7.2
(see Section 6.4).
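
Item 2 of the theorem reduces stability of the platoon to the $N$ scalar loops $1 + \lambda_i H(s)K(s) = 0$. A minimal Python sketch of that check follows, under assumptions that are not in the source: the loop is taken to be $(b_1 s + b_0)/s^2$ or $(b_2 s^2 + b_1 s + b_0)/s^3$ with hypothetical coefficients, the eigenvalues of $L$ are taken from the standard closed form for this tridiagonal matrix, and stability is tested with the Routh-Hurwitz conditions for second and third order polynomials.

```python
import math

def dirichlet_path_eigenvalues(N):
    """Closed-form eigenvalues of L in (7.10) (standard result, assumed here)."""
    return [2.0 - 2.0 * math.cos((2 * k - 1) * math.pi / (2 * N + 1))
            for k in range(1, N + 1)]

def stable_two_integrators(lam, b1, b0):
    """s^2 + lam*b1*s + lam*b0 is Hurwitz iff both coefficients are positive."""
    return lam * b1 > 0 and lam * b0 > 0

def stable_three_integrators(lam, b2, b1, b0):
    """Routh-Hurwitz test for s^3 + lam*b2*s^2 + lam*b1*s + lam*b0."""
    a2, a1, a0 = lam * b2, lam * b1, lam * b0
    return a2 > 0 and a0 > 0 and a2 * a1 > a0

def platoon_stable(N, mode_test):
    """The platoon is stable iff every modal loop passes the test."""
    return all(mode_test(lam) for lam in dirichlet_path_eigenvalues(N))

# Three integrators, hypothetical numerator 2 s^2 + 2 s + 1:
# each mode needs lam > b0 / (b1 * b2) = 0.25, which fails once lam_min is small.
first_unstable = next(
    N for N in range(1, 50)
    if not platoon_stable(
        N, lambda lam: stable_three_integrators(lam, 2.0, 2.0, 1.0))
)

# Two integrators, hypothetical numerator s + 1: stable for every lam > 0.
always_stable = all(
    platoon_stable(N, lambda lam: stable_two_integrators(lam, 1.0, 1.0))
    for N in (1, 10, 200)
)
print(first_unstable, always_stable)
```

In the three-integrator example the Routh-Hurwitz condition becomes $\lambda > b_0/(b_1 b_2)$, so the mode associated with $\lambda_{\min}(L)$ is the first to lose stability as $N$ grows, which is consistent with the role of $\lambda_{\min}(L)$ noted above.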

7.3.2  Steady-state  errors 

In  an  automated  highway  system,  it  is  in  general  desired  that  the  vehicles 
move at a constant velocity for safety, comfort, and fuel efficiency. This can be
achieved by providing a constant velocity reference to the first vehicle of the
platoon, which is equivalent to introducing a fictitious lead vehicle that moves
at a constant velocity at all times. In this section we will show that if the lead
vehicle moves at a constant velocity, i.e.,

$$Z_0(t) = Z_0^*(t) = V_d t, \qquad (7.11)$$

where $V_d$ is the desired constant velocity, and $H(s)K(s)$ has two integrators,
then all the platoon spacing errors can be made to converge to 0. If $H(s)K(s)$ has
a single integrator, then the steady state platoon spacing error vector is non-zero,
and the norm of the steady-state error grows without bound as $N$ increases. This
is stated in the next theorem.

Theorem 7.3.2. Consider the case when there are no disturbances acting on the
vehicles, i.e., $D(t) = 0$, and the lead vehicle moves at a constant velocity at all
times, i.e., $Z_0(t) = V_d t$, where $V_d > 0$ is the desired constant velocity. Let $K(s)$
be such that it achieves closed loop stability of the platoon error dynamics with
symmetric bidirectional control. Then the following are true:

1. If $H(s)K(s)$ has two integrators, then, $\forall N \in \mathbb{N}$,

$$\lim_{t \to \infty} E(t) = 0.$$

2. If $H(s)K(s)$ has one integrator, then for a platoon of size $N$, $\exists E_\infty \in \mathbb{R}^N$
such that

$$\lim_{t \to \infty} E(t) = E_\infty \neq 0,$$

and, for every $R > 0$, $\exists N_0 \in \mathbb{N}$ such that $\|E_\infty\| > R$, $\forall N > N_0$, where
$\|E_\infty\|$ denotes the Euclidean 2-norm of the $N$-vector $E_\infty$. □


Since the steady-state spacing errors grow without bound as the platoon size
increases, collisions between vehicles may occur in a sufficiently large
platoon.


7.3.3 Effect of lead vehicle’s deviation from constant velocity

In  certain  cases,  the  platoon  might  have  a  lead  vehicle  that  is  not  a  fictitious 
reference  but  a  real  vehicle  with  dynamics.  Due  to  disturbances  entering  the  lead 
vehicle,  it  is  reasonable  to  expect  that  the  leader  trajectory  will  deviate  from  the 
constant  velocity  reference  trajectory,  at  least  by  a  small  amount.  In  this  case, 
the lead vehicle’s trajectory can be modelled as

$$Z_o(t) = V_d t + \zeta_o(t),$$

where $\zeta_o(t)$ is the error from the constant-velocity trajectory.

The effect of deviations in the lead vehicle’s trajectory from a constant-velocity
one on the spacing errors in the platoon is stated in the next theorem. The proof
of the result is provided in Section 7.5. In the statement of the theorem, $\|\cdot\|$
denotes the Euclidean 2-norm of a real or complex vector and $\|\cdot\|_\infty$ denotes the
$\mathcal{H}_\infty$-norm of a transfer function.

Theorem 7.3.3. Assume H(s)K(s) has two poles at the origin and the closed loop
platoon error dynamics under symmetric bidirectional control is stable for
arbitrary $N$. Let $G_{x_o e}(s) \in \mathbb{C}^{N \times 1}$ be the transfer function from lead vehicle position
$X_o(s)$ to spacing errors $E(s)$ defined in (7.8). Then,

$$\|G_{x_o e}\|_\infty > \beta N^{1/2},$$

where $\beta$ is a constant independent of $N$.

Since $G_{x_o e}(s)$ is also the transfer function from $\zeta_o$ to $E$, the result above shows
that even if $\|\zeta_o\|_{\mathcal{L}_2}$ is bounded, $\|E\|_{\mathcal{L}_2}$ will grow unbounded as $N$ increases. The
only situation when low spacing errors can be achieved with zero steady state
error for all vehicles is when $\zeta_o(t) = 0$, an unlikely scenario.

It  turns  out  that  the  situation  is  better  when  H(s)K(s)  has  only  one  integrator, 
which  is  stated  next. 

Theorem 7.3.4. Assume H(s)K(s) has one pole at the origin and the closed loop
platoon error dynamics under symmetric bidirectional control is stable for
arbitrary $N$. Let $G_{x_o e}(s) \in \mathbb{C}^{N \times 1}$ be the transfer function from lead vehicle position
$X_o(s)$ to spacing errors $E(s)$ defined in (7.8). Then,

$$\|G_{x_o e}\|_\infty \le c,$$

where $c$ is a constant independent of $N$. □

The proof of Theorem 7.3.3 is provided in Section 7.5. A proof of Theorem 7.3.4 is
not provided, since it follows directly from the arguments provided in Section 7.3.5,
which offers an intuitive explanation of these results.

7.3.4  Disturbance  propagation 

To examine the effect of disturbances acting on the vehicles on the spacing
errors, we have to look at the transfer function matrix from the disturbances to
the spacing errors: $G_{de}(s)$. The question of disturbance propagation was already
investigated by Seiler et al. in [32], where it was shown that for the symmetric
bidirectional control scheme, it is not possible to design a K(s) to achieve a
uniform bound on $\|G_{de}\|_\infty$ w.r.t. $N$, when H(s) has two integrators and K(s) has
none. It follows from Theorem 7.3.1 that if H(s)K(s) has three integrators, then
the closed loop platoon error dynamics will be unstable for a sufficiently large
$N$. This precludes the possibility of K(s) having an integrator when H(s) has
two integrators. We consider only the case of H(s)K(s) having either one or two
integrators. The proof of the following theorem is provided in Section 7.5.

Theorem 7.3.5. Let the controller K(s) be such that the closed loop platoon
dynamics is stable for arbitrary $N$. Let $G_{de}(s) \in \mathbb{C}^{N \times N}$ be the transfer function
matrix from $D(s)$ to $E(s)$ defined in (7.9). If K(s) has no integrators, then,
irrespective of whether H(s)K(s) has one or two integrators, $\|G_{de}\|_\infty > cN$ for
some constant $c$ independent of $N$. □

This theorem tells us that even if the disturbances acting on the vehicles are
$\mathcal{L}_2$-norm bounded, the $\mathcal{L}_2$-norm of the spacing errors due to these disturbances will
grow unbounded as $N$ grows. Therefore a symmetric bidirectional control scheme
is not scalable with respect to disturbance rejection. This result was established
in [32] for vehicle models with two integrators, with the assumption that K(s)
does not have any integrators. The theorem above shows that even when H(s)
has only one integrator, the result in [32] holds. What happens when H(s) and
K(s) each have one integrator is an open question.


7.3.5  Explanation  through  graph  eigenvalues 


We  now  provide  an  intuitive  explanation  of  the  degradation  of  performance  of 
the  symmetric  bidirectional  architecture  with  increasing  N  when  the  loop  transfer 
function  has  two  integrators.  This  explanation  uses  the  spectral  properties  of  the 
interconnection  graph.  In  particular,  the  minimum  eigenvalue  of  the  Dirichlet 
Laplacian  of  the  interconnection  graph  is  seen  to  have  a  profound  impact  on  the 
performance  loss.  We  will  need  the  following  result,  which  is  also  used  in  all  the 
proofs  of  the  theorems  of  this  chapter. 


Lemma 7.3.1. Consider the matrix $L$, as defined in (7.10). Let $\lambda_{\min}$ be the
smallest eigenvalue of $L$, and let $u_1 := [u_{11}, u_{21}, \ldots, u_{N1}]^T$ be a unit-norm eigenvector of
$L$ corresponding to $\lambda_{\min}$. Then the following are true:

$$\frac{1}{N^2} \le \lambda_{\min} \le \frac{\pi^2}{N^2}, \quad \forall N,$$

$$|u_{11}| \ge N^{-1/2}, \quad \forall N.$$
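The two bounds of Lemma 7.3.1 can be checked numerically. The sketch below is only an illustration: it assumes, based on the expanded eigenvalue equations and the form of $L^{-1}$ given in the proof in Section 7.5, that $L$ is the tridiagonal matrix with diagonal (1, 2, ..., 2) and off-diagonal entries -1. The definition (7.10) itself is not reproduced here, and the function names `build_L` and `check_lemma` are hypothetical.

```python
import numpy as np

def build_L(N):
    """Assumed form of L from (7.10): tridiagonal, diagonal (1, 2, ..., 2), off-diagonals -1."""
    L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    L[0, 0] = 1.0
    return L

def check_lemma(N):
    """Return lambda_min, |u_11|, and whether both bounds of Lemma 7.3.1 hold for this N."""
    eigvals, eigvecs = np.linalg.eigh(build_L(N))  # ascending eigenvalues, orthonormal eigenvectors
    lam_min = eigvals[0]
    u11 = abs(eigvecs[0, 0])  # first component of the eigenvector for lambda_min
    ok = (1.0 / N**2 <= lam_min <= np.pi**2 / N**2) and (u11 >= N**-0.5)
    return lam_min, u11, bool(ok)

for N in (2, 10, 100, 500):
    lam_min, u11, ok = check_lemma(N)
    print(f"N={N:4d}  lambda_min={lam_min:.3e}  |u11|={u11:.4f}  bounds hold: {ok}")
```

Under this assumed $L$, the smallest eigenvalue sits between the two bounds for every $N$ tried, which is consistent with the $\Theta(1/N^2)$ behavior used in the rest of the section.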


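Now the explanation. Since $L$ is symmetric, $\exists\, U \in \mathbb{R}^{N \times N}$ with $U^T U = U U^T = I$
s.t. $L = U \Lambda U^T$, where $\Lambda$ is a real diagonal matrix containing the eigenvalues of
$L$ and $U = [u_1, u_2, \ldots, u_N]$, $u_i$ being a unit-norm eigenvector of $L$ corresponding
to the $i$th eigenvalue. The eigenvalues are arranged as

$$\lambda_{\min} \le \lambda_2 \le \cdots \le \lambda_{N-1} \le \lambda_{\max}.$$

Hence,

$$
\begin{aligned}
I + H(s)K(s)L &= U \left( I + H(s)K(s)\Lambda \right) U^T \\
\Rightarrow\ (I + H(s)K(s)L)^{-1} &= U \left( I + H(s)K(s)\Lambda \right)^{-1} U^T \\
\Rightarrow\ G_{x_o e}(s) &= U \Psi(s) U^T \phi_1,
\end{aligned}
$$

where

$$
\Psi(s) =
\begin{bmatrix}
\frac{1}{1 + \lambda_{\min} H(s)K(s)} & & & \\
& \frac{1}{1 + \lambda_2 H(s)K(s)} & & \\
& & \ddots & \\
& & & \frac{1}{1 + \lambda_{\max} H(s)K(s)}
\end{bmatrix}.
$$

Therefore, for a fixed $\omega$, the 2-norm of $G_{x_o e}(j\omega)$ is given by

$$\|G_{x_o e}(j\omega)\| = \max_i \left| \frac{1}{1 + \lambda_i H(j\omega)K(j\omega)} \right|. \qquad (7.12)$$

Figure 7.3. Nyquist plot of H(s)K(s) when H(s)K(s) has two integrators.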


When HK has two integrators, its magnitude at very low frequencies is arbitrarily
large and its phase is arbitrarily close to $-180^\circ$. As shown schematically in
Figure 7.3, the magnitude of the phasor $|1 + \lambda_{\min} H(j\omega)K(j\omega)|$ can be made smaller
than an arbitrary $\epsilon$ by choosing a small enough $\omega$ and a correspondingly small $\lambda_{\min}$,
which in turn can be done by choosing a large enough $N$, since $\lambda_{\min} = \Theta(1/N^2)$.
This is the reason that when the number of vehicles increases without bound,
$\|G_{x_o e}\|_\infty$ grows without bound.

Figure 7.4. Nyquist plot of H(s)K(s) when H(s)K(s) has one integrator.

If H(s)K(s) has only one integrator, then, since H(s) has at least one integrator
by assumption, K(s) has no integrator. It is clear from the Nyquist plot of
H(s)K(s) (a sample is shown in Figure 7.4) that the length of the phasor
$|1 + \lambda_i H(j\omega)K(j\omega)|$ can be kept larger than a positive constant for all $i$, no matter how
large $N$ is, by an appropriate choice of $K$. The reason for this is that the largest
$\lambda_i$ is at most 4 and the eigenvalues smaller than 1 only increase $|1 + \lambda_i H(j\omega)K(j\omega)|$. This
is the reason that $\|G_{x_o e}\|_\infty$ stays bounded no matter what $N$ is.
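The contrast between the two cases can be illustrated numerically. The sketch below is a hedged example rather than a result from the text: the loop transfer functions $(k_p + k_d s)/s^2$ (two integrators) and $k/s$ (one integrator) are hypothetical choices, and $\lambda_{\min}$ is computed from the tridiagonal $L$ (diagonal (1, 2, ..., 2), off-diagonals -1) inferred from the proof of Lemma 7.3.1. It evaluates $|1/(1 + \lambda_{\min} H(j\omega)K(j\omega))|$ on a frequency grid.

```python
import numpy as np

def lambda_min(N):
    """Smallest eigenvalue of the assumed L: tridiagonal, diagonal (1, 2, ..., 2), off-diagonals -1."""
    L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    L[0, 0] = 1.0
    return np.linalg.eigvalsh(L)[0]

def peak_gain(loop_tf, lam, omegas):
    """Largest value of |1 / (1 + lam * loop_tf(j*omega))| over the supplied frequencies."""
    s = 1j * omegas
    return np.max(np.abs(1.0 / (1.0 + lam * loop_tf(s))))

# Hypothetical example loops (assumptions for illustration only).
kp, kd, k = 1.0, 1.0, 1.0
two_integrators = lambda s: (kp + kd * s) / s**2
one_integrator = lambda s: k / s

omegas = np.logspace(-6, 2, 20001)
for N in (10, 100, 1000):
    lam = lambda_min(N)
    g2 = peak_gain(two_integrators, lam, omegas)
    g1 = peak_gain(one_integrator, lam, omegas)
    print(f"N={N:5d}  two integrators: {g2:8.2f}   one integrator: {g1:6.3f}")
```

For these example loops the two-integrator peak occurs near $\omega^2 = \lambda_{\min} k_p$, where it is roughly $\sqrt{k_p}/(k_d \sqrt{\lambda_{\min}})$ and therefore grows about linearly in $N$, while the one-integrator gain $|j\omega/(j\omega + \lambda_{\min} k)|$ never exceeds 1.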

7.4  Comments  and  open  problems 

Control of vehicular platoons, usually posed as an interconnection of point
masses, has practical implications for automated highways. However, even after
five decades of study, it has proved difficult to come up with a satisfactory
solution, consisting of an interconnection architecture and an associated
control algorithm, that ensures a guaranteed level of stability margin and
robustness to disturbances.

The bidirectional architecture was the natural one to study when the limitations
of the predecessor-following architecture became apparent. Although the
motivation  for  a  bidirectional  architecture  is  clear,  the  reason  for  studying  the 
symmetric  version  of  it  is  little  more  than  convenience.  Since  designing  separate 
controllers  for  individual  vehicles  is  a  challenging  task,  the  problem  is  simplified 
by  imposing  an  arbitrary  symmetry.  The  results  in  this  chapter  indicate  that 
the  symmetric  bidirectional  architecture  suffers  from  limitations  that  cannot  be 
ameliorated  by  better  controller  design. 

The immediate question is, of course, whether it is possible to do better
by removing the symmetry in the interconnection architecture. This question is
answered in the affirmative in the next chapter.

The other issue identified here is that the difficulty of the platoon problem
comes from the interplay of the interconnection topology (manifested in the graph
eigenvalues) and the unbounded gain and large negative phase of the vehicle model
at low frequencies. It seems that disturbance amplification occurs regardless of
whether the vehicle model has one or two integrators, but the single integrator
case is less sensitive to disturbances caused by the lead vehicle. The question
of what happens when the vehicle model and the controller each have one
integrator has not been answered here and needs to be examined.

In the work of Zhang et al. [153] on bidirectional control with non-linear plant
models and non-linear controller, the proposed controller was able to achieve
closed loop stability without any “slinky type” effects. However, due to the
exceedingly complex vehicle model used (with engine speed, brake torque, manifold
pressure, etc. all appearing in the control design), it is not clear how the design was
able to avoid slinky-type effects. Still, the results in [153] indicate that
perhaps the problem of vehicular platoon control should be studied in the non-linear
setting for a better chance at avoiding the difficulties identified in this chapter
and elsewhere.

7.5  Proofs 

First  we  provide  the  proof  of  Lemma  7.3.1,  which  is  used  in  all  subsequent 
proofs. 

Proof of Lemma 7.3.1. The matrix $M$ is non-singular, since $\det(M) = 1$, from
(7.5). Since $L$ is a product of a square matrix $M$ and its transpose, and $M$ is
non-singular, $L$ is positive definite [100]. Since $L = L^T$, all eigenvalues of $L^{-1}$ are
positive real. So the smallest eigenvalue of $L$ is the inverse of the largest eigenvalue
of $L^{-1}$. Note that $L^{-1}$ is given by

$$
L^{-1} =
\begin{bmatrix}
N & N-1 & \cdots & 2 & 1 \\
N-1 & N-1 & \cdots & 2 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
2 & 2 & \cdots & 2 & 1 \\
1 & 1 & \cdots & 1 & 1
\end{bmatrix}.
$$

To prove it, simply multiply the matrix with $L$ and check that an identity matrix
results. From Gerschgorin circle theory, we know that an upper bound for the
largest eigenvalue of $L^{-1}$ is $(1 + 2 + \cdots + N) \le N^2$. Therefore, a lower bound
for the smallest eigenvalue of $L$ is $1/N^2$. That is, $\lambda_{\min} \ge 1/N^2$. To get the upper
bound on $\lambda_{\min}$, let us write $L$ as

$$
L =
\begin{bmatrix}
1 & -\phi_{1(N-1)}^T \\
-\phi_{1(N-1)} & L_1
\end{bmatrix},
$$

where $\phi_{1(N-1)}$ is the first element of the canonical basis of $\mathbb{R}^{N-1}$ and
$L_1 \in \mathbb{R}^{(N-1) \times (N-1)}$. It turns out that $L_1$ is the so-called finite-difference matrix. From
Cauchy’s Interlacing Theorem, we know that $\lambda_{\min} \le \mu_{\min}$, where $\mu_{\min}$ is the
smallest eigenvalue of $L_1$. It is known [100] that $\mu_{\min} = 4 \sin^2(\pi / 2N)$. Moreover,
for $\theta > 0$, $\sin\theta < \theta$. Hence, $\mu_{\min} < \pi^2 / N^2$, which establishes the upper bound on
$\lambda_{\min}$.

To prove the second statement, note that since $u_1$ is an eigenvector of $L$
corresponding to the smallest eigenvalue of $L$, $u_1$ is also an eigenvector of $L^{-1}$
corresponding to its largest eigenvalue. Since $L^{-1}$ is a positive matrix, Perron-Frobenius
theory tells us that $|u_1| := [|u_{11}|, \ldots, |u_{N1}|]^T$ is also an eigenvector of $L^{-1}$
corresponding to its largest eigenvalue and that $|u_1|$ is a positive vector. Thus, we can
make the unit-norm eigenvector $u_1$ of $L$, corresponding to $\lambda_{\min}$, consist entirely
of positive numbers. Let’s write down the equation $L u_1 = \lambda_{\min} u_1$ in expanded
form:

$$
\begin{aligned}
u_{11} - u_{21} &= \lambda_{\min} u_{11}, \\
-u_{11} + 2u_{21} - u_{31} &= \lambda_{\min} u_{21}, \\
-u_{21} + 2u_{31} - u_{41} &= \lambda_{\min} u_{31}, \\
&\ \,\vdots
\end{aligned}
$$

It is easy to check from these equations and the positivity of the $u_{i1}$'s that the $u_{i1}$'s form
a decreasing sequence: $u_{11} > u_{21} > \cdots > u_{N1} > 0$. Since $\sum_i u_{i1}^2 = 1$, it follows that
$u_{11}^2 \ge 1/N$. This proves the Lemma. ■

Proof of Theorem 7.3.1. We start by proving the second statement, which is
similar to the results established by Fax et al. [128] about the role played by the
eigenvalues of the graph Laplacian on formation stability. It is also easy to see once we
simplify equation (7.8). Since $L$ is symmetric, $\exists\, U \in \mathbb{R}^{N \times N}$ with $U^T U = U U^T = I$
s.t. $L = U \Lambda U^T$, where $\Lambda$ is a real diagonal matrix containing the eigenvalues of $L$
and $U = [u_1, u_2, \ldots, u_N]$, $u_i$ being a unit-norm eigenvector of $L$ corresponding to
the $i$th eigenvalue. The eigenvalues are arranged as

$$\lambda_{\min} \le \lambda_2 \le \cdots \le \lambda_{N-1} \le \lambda_{\max}.$$

Hence,

$$
\begin{aligned}
I + H(s)K(s)L &= U \left( I + H(s)K(s)\Lambda \right) U^T \\
\Rightarrow\ (I + H(s)K(s)L)^{-1} &= U \left( I + H(s)K(s)\Lambda \right)^{-1} U^T.
\end{aligned}
$$

Using the above and (7.8), we get

$$
G_{x_o e}(s) = U \left( I + H(s)K(s)\Lambda \right)^{-1} U^T \phi_1 = U \Psi(s) U^T \phi_1, \qquad (7.13)
$$

where the matrix $\Psi(s) \in \mathbb{C}^{N \times N}$ is defined as

$$
\Psi(s) := \left( I + H(s)K(s)\Lambda \right)^{-1} =
\begin{bmatrix}
\frac{1}{1 + \lambda_{\min} H(s)K(s)} & & \\
& \ddots & \\
& & \frac{1}{1 + \lambda_{\max} H(s)K(s)}
\end{bmatrix}. \qquad (7.14)
$$

This gives us, with (7.13), that

$$
G_{x_o e}(s) =
\begin{bmatrix}
\sum_{i=1}^{N} \frac{u_{1i} u_{1i}}{1 + \lambda_i H(s)K(s)} \\
\sum_{i=1}^{N} \frac{u_{1i} u_{2i}}{1 + \lambda_i H(s)K(s)} \\
\vdots \\
\sum_{i=1}^{N} \frac{u_{1i} u_{Ni}}{1 + \lambda_i H(s)K(s)}
\end{bmatrix}. \qquad (7.15)
$$

It is clear now that for the closed loop to be stable, each of the transfer functions
$1/(1 + \lambda_i H(s)K(s))$ for $i \in \{1, 2, \ldots, N\}$ must be stable. As a consequence, there
cannot be any unstable pole zero cancellation between H(s) and K(s). Since H(s)
has at least one integrator by assumption, K(s) cannot have any zeros at 0.

To prove (1), we consider the root locus of the system $1 + \lambda_{\min} H(s)K(s)$. Suppose
H(s)K(s) has three integrators. Then, at least one of the branches of the root
loci will depart to the right half plane for an arbitrarily small value of $\lambda_{\min}$, even
though it may eventually return to the left half plane for a large enough value.
Since $\lambda_{\min}$ can be arbitrarily small for $N$ arbitrarily large (Lemma 7.3.1), this
means that $1/(1 + \lambda_{\min} H(s)K(s))$ will be unstable for a large enough $N$. Thus,
H(s)K(s) cannot have three integrators. The extension of these arguments to the
case of more than three integrators is trivial.
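The root-locus argument can be illustrated with a hypothetical example that is not taken from the text. Suppose H(s)K(s) = $(s + 1)^2 / s^3$, so that the characteristic polynomial of $1/(1 + \lambda H(s)K(s))$ is $s^3 + \lambda (s + 1)^2$. By the Routh-Hurwitz test this cubic is stable only when $\lambda > 1/2$. Since Lemma 7.3.1 gives $\lambda_{\min} \le \pi^2/N^2$, which is below 1/2 for $N \ge 5$, this example loop would be unstable for every such $N$. The sketch below checks the threshold numerically; the function names are illustrative.

```python
import numpy as np

def closed_loop_poles(lam):
    """Roots of s^3 + lam*(s + 1)^2, the characteristic polynomial for the assumed loop (s+1)^2/s^3."""
    return np.roots([1.0, lam, 2.0 * lam, lam])

def is_stable(lam):
    """True if every root lies strictly in the open left half plane."""
    return bool(np.all(closed_loop_poles(lam).real < 0.0))

# Sweep the gain-like parameter lambda from large to small values.
for lam in (2.0, 1.0, 0.6, 0.4, 0.1, 0.001):
    worst = np.max(closed_loop_poles(lam).real)
    print(f"lambda={lam:6.3f}  max Re(pole)={worst:+.4f}  stable: {is_stable(lam)}")
```

The branches that start at the triple pole at the origin enter the right half plane for small $\lambda$ and only return to the left half plane once $\lambda$ exceeds the threshold, which mirrors the behavior described in the proof.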

To prove (3), let $C(s) = N_c(s)/D_c(s)$ where $N_c(s)$ and $D_c(s)$ are coprime
polynomials. From the above, $C(s)$ cannot have zeros at the origin. Therefore $C(s)$
does not have poles or zeros at the origin, so $N_c(0) \neq 0$ and $D_c(0) \neq 0$. Consider
the case when H(s)K(s) has two integrators, so the characteristic polynomial of
$1/(1 + \lambda_{\min} H(s)K(s))$ is $s^2 D_c(s) + \lambda_{\min} N_c(s)$. If $N_c(0) < 0$, then $\lambda_{\min} N_c(0) < 0$
and the closed loop will have at least one unstable pole. Thus $N_c(0) > 0$. The
coefficient of $s^2$ in the characteristic polynomial is $D_c(0) + \lambda_{\min} c_2$, where $c_2$ is the
coefficient of $s^2$ in $N_c(s)$. If $D_c(0) < 0$ the coefficient of $s^2$ will be negative when
$\lambda_{\min}$ is small enough, i.e., for a large enough $N$, even when $c_2$ is positive. This
will make the closed loop unstable. Thus, in order to have closed loop stability
for arbitrary $N$, we must have $D_c(0) > 0$. Hence, $C(0) > 0$. These arguments can
be repeated for the case when H(s)K(s) has one integrator, and we arrive at the
same result. This proves the theorem. ■

Proof of Theorem 7.3.2. When H(s)K(s) has a double integrator, we can
represent H(s)K(s) as $C(s)/s^2$, where $C(s)$ does not have poles or zeros at zero and
$C(0) > 0$. This follows from Theorem 7.3.1. Consider the spacing error of the $k$th
vehicle. Since $x_o(t) = v_o t$, we have $X_o(s) = v_o/s^2$. This, together with equations (7.15)
and (7.7), gives us

$$s E_k(s) = \sum_{i=1}^{N} \frac{s\, v_o}{s^2 + \lambda_i C(s)}\, u_{1i} u_{ki}.$$

Since K(s) stabilizes the platoon dynamics, $1/(s^2 + \lambda_i C(s))$ is a stable transfer
function for $i \in \{1, 2, \ldots, N\}$ and therefore $\lim_{s \to 0} s v_o / (s^2 + \lambda_i C(s)) = 0$. Since the
$u_{ij}$’s are bounded numbers, $s E_k(s) \to 0$ as $s \to 0$. Hence, from the Final Value
Theorem, $\lim_{t \to \infty} e(t) = \lim_{s \to 0} s E(s) = 0$. This proves the first statement of the
theorem.


Now we consider the case of H(s)K(s) having only one integrator. We can
represent H(s)K(s) as $C(s)/s$, where $C(s)$ doesn’t have poles or zeros at the origin
and $C(0) > 0$ (Theorem 7.3.1). Since $X_o(s) = v_o/s^2$, we have

$$s E(s) = U\, s \Psi(s) \frac{v_o}{s^2}\, U^T \phi_1 = U Q(s) U^T \phi_1,$$

where $Q(s)$ is defined as

$$
Q(s) :=
\begin{bmatrix}
\frac{v_o}{s + \lambda_{\min} C(s)} & & \\
& \ddots & \\
& & \frac{v_o}{s + \lambda_{\max} C(s)}
\end{bmatrix}.
$$

Hence, once again from the Final Value Theorem,

$$\lim_{t \to \infty} e(t) = \lim_{s \to 0} s E(s) = U Q(0) U^T \phi_1 =: e_\infty,$$

which is a constant vector. Thus the steady state error converges to a constant
vector.

To prove that $\|e_\infty\|$ grows unbounded with $N$, note that

$$\|e_\infty\|_2^2 = \phi_1^T U Q(0)^T Q(0) U^T \phi_1. \qquad (7.16)$$

Since $U^T \phi_1$ is the first row of $U$ and $Q(0)$ is a real diagonal matrix, we can reduce
(7.16) to

$$
\|e_\infty\|_2^2 = \sum_{i=1}^{N} \left( \frac{v_o}{\lambda_i C(0)} \right)^2 u_{1i}^2 \ \ge\ \left( \frac{v_o}{\lambda_{\min} C(0)} \right)^2 u_{11}^2.
$$

From Lemma 7.3.1, we have $1/\lambda_{\min} \ge N^2/\pi^2$ and $|u_{11}| \ge 1/N^{1/2}$. Using these in
the above, we get $\|e_\infty\|_2 \ge \gamma N^{3/2}$, where $\gamma = v_o / (C(0)\pi^2)$. Since this lower bound
is an increasing function of $N$, the second part of the theorem follows immediately.
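As a numerical sanity check of this bound, the sketch below evaluates the steady-state error directly from $e_\infty = U Q(0) U^T \phi_1 = (v_o / C(0))\, L^{-1} \phi_1$. It assumes the tridiagonal form of $L$ inferred from the proof of Lemma 7.3.1 (diagonal (1, 2, ..., 2), off-diagonals -1); the values of $v_o$ and $C(0)$ and the function names are arbitrary illustrative choices.

```python
import numpy as np

def build_L(N):
    """Assumed form of L from (7.10): tridiagonal, diagonal (1, 2, ..., 2), off-diagonals -1."""
    L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    L[0, 0] = 1.0
    return L

def steady_state_error(N, v0=1.0, C0=1.0):
    """e_inf = (v0 / C0) * L^{-1} phi_1, i.e. U Q(0) U^T phi_1 with Q(0) = diag(v0 / (lambda_i * C0))."""
    phi1 = np.zeros(N)
    phi1[0] = 1.0  # first canonical basis vector
    return (v0 / C0) * np.linalg.solve(build_L(N), phi1)

v0, C0 = 1.0, 2.0                 # illustrative values, not from the text
gamma = v0 / (C0 * np.pi**2)      # constant in the lower bound gamma * N^(3/2)
for N in (5, 50, 500):
    norm = np.linalg.norm(steady_state_error(N, v0, C0))
    print(f"N={N:4d}  ||e_inf||={norm:10.2f}  gamma*N^1.5={gamma * N**1.5:8.2f}")
```

With this $L$, the first column of $L^{-1}$ is $(N, N-1, \ldots, 1)^T$, so the $k$th steady-state spacing error is proportional to $N - k + 1$ and the norm grows like $N^{3/2}$, in agreement with the lower bound.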


The  following  technical  result  will  be  needed  for  the  proof  of  Theorem  7.3.3. 


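Lemma 7.5.1. Let $C(s)$ be a SISO transfer function that has no poles or zeros at
the origin and $C(0) > 0$. Then, $\exists\, \beta \in (0, +\infty)$ and $\exists\, N_0 \in \mathbb{N}$ such that $\forall N > N_0$,

$$\sup_{\omega} \left| \frac{1}{1 - \lambda_{\min}(N) \frac{C(j\omega)}{\omega^2}} \right| > \beta N,$$

where $\lambda_{\min}(N)$ is the smallest eigenvalue of the matrix $L \in \mathbb{R}^{N \times N}$ defined in
(7.10). □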


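Proof of Lemma 7.5.1. First we will establish that $|C(j\omega) - C(0)| < \gamma |\omega|$ for some
positive constant $\gamma$ when $\omega$ is small enough. Let $C(s) = N_c(s)/D_c(s)$, where $N_c(s)$
and $D_c(s)$ are coprime polynomials in $s$ (with real coefficients) with degrees $m$
and $n$, respectively. We write down

$$C(s) = \frac{N_c(s)}{D_c(s)} = \frac{z_m s^m + \cdots + z_1 s + z_o}{p_n s^n + \cdots + p_1 s + p_o},$$

where $z_o$ and $p_o$ are non-zero since $C(s)$ does not have poles or zeros at the origin.
Expanding the expression for $C(s) - C(0)$ and doing a little algebra, we see that

$$C(s) - C(0) = \frac{s^k Q(s)}{D_c(s)\, p_o},$$

where $Q(s)$ is a polynomial in $s$ with a non-zero constant term $q_o$ and $k \ge 1$. Since
$Q(s)$ and $D_c(s)$ both have non-zero constant terms, $\lim_{\omega \to 0} \frac{Q(j\omega)}{D_c(j\omega) p_o} = \frac{q_o}{p_o^2} \neq 0$.
Therefore, $\exists\, \omega_0$ s.t. if $|\omega| < \omega_0$, then $\left| \frac{Q(j\omega)}{D_c(j\omega) p_o} \right| < \left| \frac{q_o}{p_o^2} \right| + 1 =: \gamma$. Therefore we
get that there exist $\omega_0 > 0$, $\gamma > 0$ and an integer $k \ge 1$ s.t., $\forall |\omega| < \min(1, \omega_0)$,

$$|C(j\omega) - C(0)| < |\omega^k| \gamma \le |\omega| \gamma. \qquad (7.17)$$

Define

$$f(\omega) := \left| \frac{1}{1 - \lambda_{\min}(N) \frac{C(j\omega)}{\omega^2}} \right|.$$

Pick $N_0$ such that $\omega^* := \sqrt{\lambda_{\min}(N) C(0)} \in (0, \min(1, \omega_0))$, $\forall N > N_0$. Hence,

$$f(\omega^*) = \frac{1}{\left| 1 - \frac{C(j\omega^*)}{C(0)} \right|} = \frac{C(0)}{|C(j\omega^*) - C(0)|} > \frac{C(0)}{\gamma \omega^*}.$$

The last inequality follows from (7.17). Substituting the value of $\omega^*$, we get

$$f(\omega^*) > \frac{1}{\gamma} \sqrt{\frac{C(0)}{\lambda_{\min}(N)}}, \quad \forall N > N_0. \qquad (7.18)$$

From Lemma 7.3.1, we know that $1/\lambda_{\min}(N) \ge N^2/\pi^2$. Using this in the inequality
(7.18), we get

$$f(\omega^*) > \beta N, \quad \forall N > N_0, \qquad (7.19)$$

where $\beta := \left( C(0) / \gamma^2 \pi^2 \right)^{1/2}$ is a positive constant. This proves the lemma. ■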

Now  we  are  ready  to  prove  Theorem  7.3.3. 

Proof of Theorem 7.3.3. Since H(s)K(s) has two integrators, H(s)K(s) can be
written as $C(s)/s^2$. From Theorem 7.3.1, it follows that $C(s)$ cannot have poles
or zeros at 0 and $C(0) > 0$. From (7.13), we get

$$\|G_{x_o e}(s)\|_2^2 = G_{x_o e}^*(s)\, G_{x_o e}(s) = \phi_1^T U \Psi^*(s) \Psi(s) U^T \phi_1.$$

Since the vector $U^T \phi_1$ is the first row of $U$, using (7.14), this reduces to

$$\|G_{x_o e}(s)\|_2 = \left( \sum_{i=1}^{N} \frac{u_{1i}^2}{|1 + \lambda_i H(s)K(s)|^2} \right)^{1/2}. \qquad (7.20)$$

The $\mathcal{H}_\infty$ norm of the transfer function vector $G_{x_o e}$ is:

$$\|G_{x_o e}\|_\infty = \sup_{\omega} \|G_{x_o e}(j\omega)\|_2. \qquad (7.21)$$

Thus,

$$\|G_{x_o e}\|_\infty \ge \sup_{\omega} \frac{|u_{11}|}{|1 + \lambda_{\min} H(j\omega)K(j\omega)|}. \qquad (7.22)$$

We can now apply the result established in Lemma 7.5.1 to claim that $\exists\, \beta \in
(0, +\infty)$ and $\exists\, N_0 \in \mathbb{N}$ such that

$$\sup_{\omega} \left| \frac{1}{1 + \lambda_{\min}(N) H(j\omega)K(j\omega)} \right| > \beta N, \quad \forall N > N_0.$$

From Lemma 7.3.1, we know that $|u_{11}| \ge 1/\sqrt{N}$. Using these two inequalities in
(7.22), we get

$$\|G_{x_o e}\|_\infty > \beta N^{1/2}, \quad \forall N > N_0.$$

Since this lower bound grows unbounded as $N$ increases, the result follows
immediately. ■


Proof of Theorem 7.3.5. When K(s) has no integrators, and HK has $k$
integrators, where $k$ is either one or two, $H$ must be of the form $H(s) = H_1(s)/s^k$ with
$H_1(0) > 0$. From (7.9) and using $L = M^T M$, we get

$$G_{de}(s) = H(s) \left[ I + H(s)K(s) M^T M \right]^{-1} M^T = H(s) s^k \left[ M^{-T} s^k + C(s) M \right]^{-1}, \qquad (7.23)$$

where $C(s) := s^k H(s)K(s)$.
Therefore, 


l|G*(0)||  =  | i^M-1 


l|G*(0)||  = 
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since 


whose  2-norm  is  N.  This  implies  1 1 de 1 1 oo  =  snpw  l|G?de(ju;)||  >  ||G<fe(0)||  >  cN  for 
some  constant  c  independent  of  N.  This  proves  the  statement.  ■ 
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Chapter  8 


Control  of  vehicular  platoons: 
asymmetric  bidirectional  control 

8.1  Introduction 

In  this  chapter  we  revisit  the  problem  examined  in  the  previous  chapter  -  that 
of  decentralized  control  of  a  string  of  vehicles  moving  in  a  straight  line  in  order 
to  maintain  constant  inter-vehicular  separation.  In  the  general  bidirectional  case, 
there  is  no  reason  for  a  vehicle’s  controller  to  put  equal  weights  on  the  front  spacing 
error (i.e., the error w.r.t. the preceding vehicle’s relative position) and back
spacing error (i.e., the error w.r.t. the following vehicle’s relative position).
Moreover,  the  controllers  in  one  vehicle  should  be  allowed  to  be  different  from 
those  in  other  vehicles.  The  difficulty  of  this  general  problem  is,  however,  that  we 
are  faced  with  the  task  of  designing  2N  controllers  when  there  are  N  vehicles. 

Due to the challenging nature of this problem, we resort to a continuum
approximation of the platoon dynamics in the form of a partial differential
equation (PDE). The continuum approximation turns out to be quite useful in
providing insight into the problem, which led to the improved, “mistuning”-based,
design described in this chapter. Although the continuum approximation is made
under the assumption of a large number of vehicles, the resulting design and analysis
show that the benefits are tangible even for a small number of vehicles.

Chapter  organization:  We  start  with  a  summary  of  the  results  in  Section  8.2 
and  state  the  problem  in  Section  8.3  in  formal  terms.  Unlike  some  of  the  previous 
chapters,  we  do  not  have  a  “results”  section,  since  we  need  to  describe  the  PDE 
model  of  the  platoon  dynamics  in  order  to  state  the  results.  Section  8.4  describes 
the derivation of the PDE model. In Section 8.5 the PDE is analyzed to explain the
loss of stability with increasing number of vehicles, and Section 8.6 describes how
to ameliorate such loss of stability by mistuning. Section 8.7 reports time-domain
simulation results that show the benefit of mistuning.


8.2  Contributions  and  prior  work 

The  contributions  of  the  work  reported  in  this  chapter  are  briefly  summarized 
below. 

1. In order to facilitate the analysis, we derive a linear partial differential
equation (PDE) based continuous analogue of the (spatially) discrete platoon
dynamics. The PDE model is inspired by the extensive literature on PDE
based models of traffic dynamics (see the review [162] and references therein,
and the PDE model of a string of vehicles considered in [163]). However,
a PDE based model of a controlled platoon is a novel contribution of our
work.

2. The PDE model is used to derive a controller-independent conclusion on
stability with the symmetric bidirectional architecture. In particular, the behavior
of the least stable eigenvalue of the discrete platoon dynamics is predicted by
analyzing the PDE spectra. We show that the least stable closed-loop
eigenvalue approaches zero as $O(1/N^2)$. This prediction is confirmed by numerical
computation.

3. We show that an arbitrarily small perturbation (asymmetry) in the controller
gains from their nominal (symmetric) values can improve the closed-loop
damping such that the least stable eigenvalue now approaches 0 only as
$O(1/N)$. Numerical computations of eigenvalues in discrete platoons are used
to validate these results.

Perhaps  more  important  than  the  improvement  itself  is  the  intuition  into  the 
problem  that  the  PDE  provides,  which  made  the  improvement  possible  in 
the  first  place.  The  PDE  reveals,  better  than  the  discrete  state-space  equa¬ 
tion  does,  the  underlying  cause  of  progressive  loss  of  stability  with  a  sym¬ 
metric  bidirectional  architecture  and  suggests  a  mistuning-based  approach 
to  improve  the  stability  margin  by  introducing  asymmetry.  In  particular, 
forward-backward  asymmetry  in  the  control  is  seen  to  be  beneficial.  The 
asymmetry  refers  to  the  assignment  of  controller  gains  such  that  a  vehicle 
utilizes  information  from  the  preceding  and  following  vehicles  differently. 
We  also  show  how  to  achieve  the  best  improvement  in  closed-loop  stability 
by  exploiting  this  asymmetry. 
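The progressive loss of stability margin under symmetric bidirectional control can be observed numerically in a simple discrete model. The sketch below is only an illustration under assumptions that are not taken from this chapter: each vehicle is a double integrator, the control uses a relative-position gain `k` that is the same for the front and back errors together with an absolute velocity damping gain `b`, and fictitious lead and follow vehicles fix both ends of the platoon (so the coupling matrix is the standard tridiagonal matrix with 2 on the diagonal and -1 off it).

```python
import numpy as np

def least_stable_real_part(N, k=1.0, b=1.0):
    """Largest real part among the closed-loop eigenvalues of the assumed symmetric platoon model."""
    T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)  # both ends anchored
    # State is (position errors, velocity errors): d/dt [y; v] = [[0, I], [-k T, -b I]] [y; v]
    A = np.block([[np.zeros((N, N)), np.eye(N)],
                  [-k * T, -b * np.eye(N)]])
    return np.max(np.linalg.eigvals(A).real)

for N in (25, 50, 100, 200):
    re = least_stable_real_part(N)
    print(f"N={N:4d}  max Re(eig)={re:.3e}  N^2 * max Re(eig)={N**2 * re:.3f}")
```

In this hypothetical model the least stable eigenvalue is approximately $-k \lambda_{\min}(T)/b$, so doubling $N$ moves it about four times closer to the imaginary axis, and the product of $N^2$ with its real part settles to a constant.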

Prior work: Prior work on the control of vehicular platoons has been
reviewed in the previous chapter; see Section 7.2.1. The idea of using non-identical
controllers to improve robustness to disturbance of the closed loop platoon has
been considered in [164]. However, in the design proposed in [164], the controller
gains at individual vehicles grow without bound as N increases. In contrast, the
mistuning-based design proposed in this chapter keeps controller gains uniformly
bounded within any prescribed value, independent of the number of vehicles. The
role of asymmetry in the interconnection architecture on improving closed loop
stability is novel.

Figure 8.1. A platoon with N vehicles moving in one dimension. (a) A platoon with
fictitious lead and follow vehicles. (b) Same platoon in y co-ordinates.

PDE modeling of traffic flow is quite well developed; see [162] and references
therein for a thorough review of this topic. We note that mistuning-based
approaches have been used for stability augmentation in many structural
applications; see [165-168] for some recent references.


8.3  Problem  statement 

Consider a platoon of N identical vehicles moving in a straight line as shown
schematically in Figure 8.1(a). For these vehicles, we consider the two
scenarios tabulated in Table 8.1. In scenario I, we introduce (after [158, 169])
a fictitious lead vehicle and a fictitious follow vehicle, indexed as 0 and N + 1
respectively. Their behavior is specified by imposing constant-velocity
trajectories $Z_0(t) = V_d t$ and $Z_{N+1}(t) = V_d t - (N + 1)\Delta$. In scenario II, only a fictitious
lead vehicle with index i = 0 with $Z_0(t) = V_d t$ is introduced. For the last vehicle
in the platoon in scenario II, there is no follower vehicle and it uses information
only from its predecessor to maintain a constant gap.

For ease of analysis and design, we make the following simplifying
assumptions:

1. Every vehicle is a fully actuated point mass without damping (i.e., a double
integrator).

2. Every vehicle employs a static gain feedback control law.

These simplifications are made for ease of analysis only; the results are seen to be
valid more generally. Let $Z_i(t)$ and $V_i(t)$ denote the position and the velocity,
respectively, of the ith vehicle for i = 1, 2, ..., N. Since the inter-connected platoon
dynamics are of primary interest, a simple double integrator is used to model the
essential dynamics of an individual vehicle:

$$\ddot{Z}_i = U_i,$$

where $U_i$ is the control (engine torque) applied on the ith vehicle. Formally, such
a model arises after the velocity dependent drag and other non-linear terms have
been eliminated by using feedback linearization [145, 146]. The control objective
is to maintain a constant inter-vehicular distance $\Delta$ and a constant velocity $V_d$
for every vehicle.


To facilitate the analysis, consider the co-ordinate change

$$y_i = 2\pi\left(\frac{Z_i(t) - V_d t + L}{L}\right), \qquad v_i = 2\pi\,\frac{V_i - V_d}{L}, \tag{8.1}$$
Scenario    Length $L$       Leader       Follower
I           $(N+1)\Delta$    $v_0 = 0$    $v_{N+1} = 0$
II          $N\Delta$        $v_0 = 0$    (none)

Table 8.1. The two scenarios: scenario I with fictitious lead and follow vehicles,
scenario II with a fictitious lead vehicle only.

where $L$ denotes the platoon length, which equals $(N+1)\Delta$ in scenario I and
$N\Delta$ in scenario II. Figure 8.1(b) depicts the schematic of the platoon in the new
co-ordinates. The normalization ensures that $y_0(t) \equiv 2\pi$, $y_i(t) \in [0, 2\pi]$, and
$y_{N+1}(t) \equiv 0$ ($y_N(t) = 0$) in scenario I (II). Here, we have implicitly assumed
that deviations of the vehicle positions and velocities from their desired values are
small.
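The normalization can be checked numerically. The following is a minimal sketch, assuming the co-ordinate change $y_i = 2\pi(Z_i - V_d t + L)/L$, $v_i = 2\pi(V_i - V_d)/L$ and scenario I with $L = (N+1)\Delta$; the function name `normalize` and the numerical values are illustrative choices, not taken from the text.

```python
import math

def normalize(Z, V, t, V_d, L):
    """Map a physical position Z and velocity V to the normalized pair (y, v)."""
    y = 2.0 * math.pi * (Z - V_d * t + L) / L
    v = 2.0 * math.pi * (V - V_d) / L
    return y, v

# Scenario I: N vehicles plus a fictitious leader (index 0) and follower (index N+1).
N, Delta, V_d, t = 10, 5.0, 20.0, 3.0      # illustrative values only
L = (N + 1) * Delta

# Every vehicle sits at its desired position Z_i = V_d*t - i*Delta and moves at V_d.
states = [normalize(V_d * t - i * Delta, V_d, t, V_d, L) for i in range(N + 2)]
y = [s[0] for s in states]                 # normalized positions, leader first
```

With these inputs the leader maps to $2\pi$, the fictitious follower maps to $0$, all normalized velocities vanish, and consecutive vehicles are separated by $2\pi\Delta/L$.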

In the normalized co-ordinates, the dynamics of the $i$th vehicle are described by

$$\ddot{y}_i = u_i,$$

where $u_i := 2\pi U_i / L$. The desired spacing and velocity are

$$\delta := \frac{\Delta}{L/2\pi}, \qquad v_d := \frac{V_d - V_d}{L/2\pi} = 0,$$

and the desired position of the $i$th vehicle is

$$y_i^d(t) \equiv 2\pi - i\delta. \tag{8.2}$$

The position and velocity errors for the $i$th vehicle in the $y$ co-ordinate are:

$$\tilde{y}_i(t) = y_i(t) - y_i^d(t), \qquad \tilde{v}_i = v_i - v_d = v_i.$$

We note that $\tilde{v}_0 = \tilde{v}_{N+1} = 0$ for the fictitious lead and follow vehicles.
For the purposes of control, it is useful to introduce the front and back relative
(position) errors for the $i$th vehicle:

$$e_i^{(f)} = \frac{Z_{i-1} - Z_i - \Delta}{L/2\pi} = y_{i-1} - y_i - \delta,$$
$$e_i^{(b)} = \frac{Z_i - Z_{i+1} - \Delta}{L/2\pi} = y_i - y_{i+1} - \delta,$$

for $i \in \{1, \ldots, N\}$.

The quantity $e_i^{(f)}$ denotes the front relative position error between the $i$th vehicle and
its predecessor, the $(i-1)$th vehicle, and $e_i^{(b)}$ denotes the back relative position error
between the $i$th vehicle and its follower, the $(i+1)$th vehicle. The relative errors, including the
velocity error, can be obtained in practice by on-board devices such as radars, GPS
and speed sensors. Consistent with the decentralized bidirectional linear control
architecture, the control $u_i$ for the $i$th vehicle is assumed to depend only on 1) its
velocity error $\tilde{v}_i$, and 2) the relative position errors between itself and its immediate
neighbors. That is,

$$u_i = k_i^{(f)} e_i^{(f)} - k_i^{(b)} e_i^{(b)} - b_i \tilde{v}_i, \tag{8.3}$$

where $k_i^{(f)}$, $k_i^{(b)}$, and $b_i$ are positive constants. The first two terms are used to compensate
for any deviation away from nominal with the predecessor and the follower vehicles,
respectively. The third term is used to obtain a zero steady-state error in
velocity. In principle, relative velocity errors between neighboring vehicles can
also be incorporated into the control, but we do not examine this situation here.
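As a concrete illustration of the bidirectional law, the sketch below evaluates the control for every vehicle from the normalized positions and velocities, taking the law to be $u_i = k_i^{(f)} e_i^{(f)} - k_i^{(b)} e_i^{(b)} - b_i v_i$ with $e_i^{(f)} = y_{i-1} - y_i - \delta$ and $e_i^{(b)} = y_i - y_{i+1} - \delta$. The function name `control` and the test values are hypothetical.

```python
import math

def control(y, v, kf, kb, b, delta):
    """Return [u_1, ..., u_N].

    y, v hold entries for indices 0..N+1, i.e. they include the fictitious
    leader (index 0) and follower (index N+1); kf, kb, b have length N.
    """
    N = len(y) - 2
    u = []
    for i in range(1, N + 1):
        e_front = y[i - 1] - y[i] - delta      # front relative position error
        e_back = y[i] - y[i + 1] - delta       # back relative position error
        u.append(kf[i - 1] * e_front - kb[i - 1] * e_back - b[i - 1] * v[i])
    return u

N = 5
delta = 2 * math.pi / (N + 1)
y_des = [2 * math.pi - i * delta for i in range(N + 2)]   # desired positions
v0 = [0.0] * (N + 2)
ones, half = [1.0] * N, [0.5] * N

u_rest = control(y_des, v0, ones, ones, half, delta)      # platoon at rest in formation
y_pert = list(y_des)
y_pert[3] += 0.01                                         # vehicle 3 nudged forward
u_pert = control(y_pert, v0, ones, ones, half, delta)
```

At the desired formation all controls vanish. Nudging vehicle 3 forward produces a restoring (negative) control on vehicle 3 and positive controls on vehicles 2 and 4, which is the symmetric coupling one expects from equal front and back gains.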


Equation (8.3) represents control using state feedback, albeit only with local
(nearest neighbor) information. Analysis of this controller structure is relevant
even if there are additional dynamic elements in the controller. First, a dynamic
controller cannot be allowed to have a zero at the origin. The reason is that for a
constant velocity reference, such a pole-zero cancellation will lead to steady-state
errors that grow without bound as $N$ increases (see Theorem 7.3.2 in the previous
chapter). Second, a dynamic controller cannot have an integrator either. For if
it does, the closed-loop platoon dynamics become unstable for a sufficiently large
value of $N$ (see Theorem 7.3.1). As a result, any allowable dynamic compensator
must essentially act as a static gain at low frequencies. Furthermore, the results
of [160] indicate that the principal challenge in controlling large platoons arises
from the double integrator with its unbounded gain and large negative phase at
low frequencies (see Section 7.3.5). Hence, the limitation and its amelioration,
discussed here only for the local state feedback of (8.3), are also relevant to the
case where additional dynamic elements appear in the control.


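To describe the closed-loop dynamics of the platoon, define

$$\tilde{y} := [\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_N]^T, \qquad \tilde{v} := [\tilde{v}_1, \ldots, \tilde{v}_N]^T.$$

For scenario I with fictitious lead and follow vehicles, the control law (8.3) yields
the following closed-loop dynamics:

$$\begin{bmatrix} \dot{\tilde{y}} \\ \dot{\tilde{v}} \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & I \\ -K^{(f)} M^T - K^{(b)} M & -B \end{bmatrix}}_{A_{L-F}} \begin{bmatrix} \tilde{y} \\ \tilde{v} \end{bmatrix}, \tag{8.4}$$

where $K^{(f)} = \mathrm{diag}(k_1^{(f)}, k_2^{(f)}, \ldots, k_N^{(f)})$, $K^{(b)} = \mathrm{diag}(k_1^{(b)}, k_2^{(b)}, \ldots, k_N^{(b)})$, $B = \mathrm{diag}(b_1, b_2, \ldots, b_N)$,
and

$$M = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -1 \\ 0 & \cdots & 0 & 0 & 1 \end{bmatrix}.$$

For scenario II with a fictitious lead vehicle and no follow vehicle, the closed-loop
dynamics are

$$\begin{bmatrix} \dot{\tilde{y}} \\ \dot{\tilde{v}} \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & I \\ -K_o^{(f)} M^T - K_o^{(b)} M_o & -B \end{bmatrix}}_{A_L} \begin{bmatrix} \tilde{y} \\ \tilde{v} \end{bmatrix}, \tag{8.5}$$

where $K_o^{(f)} = K^{(f)}$, $K_o^{(b)} = \mathrm{diag}(k_1^{(b)}, k_2^{(b)}, \ldots, k_{N-1}^{(b)}, 0)$, and

$$M_o = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -1 \\ 0 & \cdots & 0 & 0 & 0 \end{bmatrix}.$$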


Our goal is to understand the progressive loss of closed-loop stability with
increasing $N$ and to devise ways to ameliorate such a loss by appropriately choosing
the controller gains. While in principle this can be done by analyzing the
eigenvalues of the matrix $A_{L-F}$ (scenario I) and of $A_L$ (scenario II), we take an
alternate route. When the number of vehicles $N$ is large, we approximate the
dynamics of the discrete platoon by a partial differential equation (PDE) which
is used for analysis and control design.
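For reference, the direct eigenvalue route can be sketched in a few lines of NumPy. The sketch assumes the closed-loop state matrices have the block form $[0,\ I;\ -K^{(f)}M^T - K^{(b)}M,\ -B]$, with the back gain of the last vehicle and the last row of $M$ set to zero in scenario II; the function names `closed_loop_matrix` and `least_stable` are ours.

```python
import numpy as np

def closed_loop_matrix(kf, kb, b, follower=True):
    """State matrix for scenario I (follower=True) or scenario II (follower=False).

    kf, kb, b: length-N arrays of front gains, back gains and velocity gains.
    """
    kf, kb, b = (np.array(a, dtype=float) for a in (kf, kb, b))
    N = kf.size
    M = np.eye(N) - np.eye(N, k=1)        # 1 on the diagonal, -1 just above it
    Mo = M.copy()
    if not follower:                      # last vehicle has no follower to track
        kb[-1] = 0.0
        Mo[-1, :] = 0.0
    position_block = -np.diag(kf) @ M.T - np.diag(kb) @ Mo
    return np.block([[np.zeros((N, N)), np.eye(N)],
                     [position_block, -np.diag(b)]])

def least_stable(A):
    """Eigenvalue of A with the largest real part."""
    eig = np.linalg.eigvals(A)
    return eig[np.argmax(eig.real)]

N = 25
ones = np.ones(N)
s_I = least_stable(closed_loop_matrix(ones, ones, 0.5 * ones, follower=True))
s_II = least_stable(closed_loop_matrix(ones, ones, 0.5 * ones, follower=False))
```

Repeating the last three lines for increasing `N` shows both eigenvalues drifting toward the origin, which is the loss of stability margin that the PDE analysis below quantifies.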


8.4  Continuous  model  of  an  automated  platoon 

In this section, we develop a continuous PDE approximation of the (spatially)
discrete platoon dynamics with bidirectional control. The PDE is derived with
respect to a normalized spatial co-ordinate $x \in [0, 2\pi]$. We recall that the normalized
location of the $i$th vehicle (denoted as $y_i$) too was defined with respect
to this co-ordinate system. In effect, the two symbols $x$ and $y$ correspond to the
same co-ordinate representation but are used here to distinguish the continuous
and discrete formulations.

With respect to the normalized co-ordinate, every car is nominally assumed to
lie within an interval of length $\delta$ (see Fig. 8.1(b)). For the purposes of continuous
approximation, we smear each vehicle over its interval to get a constant mean
density

$$\rho_0 = \frac{N}{2\pi} = \frac{1}{\delta} \tag{8.6}$$

for $N$ vehicles in the platoon. Dynamics of the individual vehicles in the platoon
create perturbations in the density, with the local density $\rho(x,t)$ increasing
(decreasing) as the cars move closer (apart). The starting point of macroscopic
continuous models of traffic flow thus is the continuity equation, which relates
the density $\rho(x,t)$ (vehicles per unit characteristic length) at spatial co-ordinate
$x \in [0, 2\pi]$ and time $t \in [0, \infty)$ with the velocity $v(x,t)$:

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho v)}{\partial x} = 0.$$


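In order to analyze small perturbations about the mean, we define the perturbed
quantities $\tilde{\rho}$, $\tilde{v}$ by the relations

$$\rho(x,t) = \rho_0 + \tilde{\rho}(x,t), \qquad v(x,t) = 0 + \tilde{v}(x,t),$$

where the mean velocity is zero because of our choice of the co-ordinate system
(see (8.1)). Even though $v$ and $\tilde{v}$ are the same, we use $\tilde{v}$ to draw attention to
the fact that the velocity is a small perturbation of the mean value. For such
perturbations, the linearized continuity equation is given by

$$\frac{\partial \tilde{\rho}}{\partial t} + \rho_0\frac{\partial \tilde{v}}{\partial x} = 0 \quad\Rightarrow\quad \frac{\partial \tilde{v}}{\partial x} = -\frac{1}{\rho_0}\frac{\partial \tilde{\rho}}{\partial t}. \tag{8.7}$$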


This equation is consistent with the physical intuition whereby a positive gradient
in velocity (due to, say, the predecessor speeding up or the follower slowing down)
will cause the local density to decrease. In order to study density perturbations,
one thus needs to specify the velocity, which here arises from the linearized
momentum balance:

$$\frac{\partial \tilde{v}}{\partial t} = F(x,t) \quad\Rightarrow\quad \frac{\partial \tilde{v}}{\partial t} = u(x,t), \tag{8.8}$$

where $F(x,t)$ is the acceleration due to control $u(x,t)$ and possibly disturbance.
Here, we focus only on the control. Since we are in the moving coordinate frame,
the momentum equation has the partial derivative rather than the usual total
derivative on the left hand side. Using (8.3), the control for the $i$th vehicle in the
platoon is of the form:

$$u_i(t) = u_i^{(pf)}(t) - u_i^{(pb)}(t) + u_i^{(v)}(t),$$

where

$$u_i^{(pf)}(t) := k_i^{(f)}\big(y_{i-1}(t) - y_i(t) - \delta\big), \qquad u_i^{(pb)}(t) := k_i^{(b)}\big(y_i(t) - y_{i+1}(t) - \delta\big),$$

are the position-dependent front and back control terms, and

$$u_i^{(v)}(t) := -b_i v_i(t). \tag{8.9}$$


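Corresponding to this discrete control law, we derive a continuous approximation

$$u(x,t) = \underbrace{u^{(pf)}(x,t) - u^{(pb)}(x,t)}_{u^{(p)}(x,t)} + u^{(v)}(x,t), \tag{8.10}$$

such that $u^{(pf)}(y_i,t) = u_i^{(pf)}(t)$, $u^{(pb)}(y_i,t) = u_i^{(pb)}(t)$, and $u^{(v)}(y_i,t) = u_i^{(v)}(t)$. Now,

$$u_i^{(pf)} = k_i^{(f)}(y_{i-1} - y_i - \delta) = k_i^{(f)}\left(1 - \frac{\delta}{y_{i-1} - y_i}\right)(y_{i-1} - y_i) \approx \int_{y_i}^{y_{i-1}} k_f(x)\left(1 - \frac{\rho(x,t)}{\rho_0}\right)dx,$$

where the approximation is obtained by smearing the control action over the
interval $[y_i, y_{i-1}]$ and substituting $k_f(x)$ for the discrete control gain $k_i^{(f)}$ (the local
density on the interval is approximately $1/(y_{i-1} - y_i)$, and $\delta = 1/\rho_0$). Since
$\rho = \rho_0 + \tilde{\rho}$, we have

$$u_i^{(pf)} \approx -\frac{1}{\rho_0}\int_{y_i}^{y_{i-1}} k_f(x)\,\tilde{\rho}(x,t)\,dx = -\frac{1}{\rho_0}\,[k_f\tilde{\rho}](x^+,t)\,\delta,$$

by the Mean Value Theorem, where $x^+ \in [y_i, y_{i-1}]$. Since $u_i^{(pf)}(t) = u^{(pf)}(y_i,t)$
and $\delta = 1/\rho_0$ (see (8.6)), we take

$$u^{(pf)}(x,t) = -\frac{1}{\rho_0^2}\,[k_f\tilde{\rho}](x^+,t), \quad\text{and}\quad u^{(pb)}(x,t) = -\frac{1}{\rho_0^2}\,[k_b\tilde{\rho}](x^-,t),$$

where $x^+ \in [y_i, y_{i-1}]$ and $x^- \in [y_{i+1}, y_i]$. Using (8.10), we have

$$u^{(p)}(x,t) = u^{(pf)}(x,t) - u^{(pb)}(x,t) = -\frac{1}{\rho_0^2}\Big([k_f\tilde{\rho}](x^+,t) - [k_b\tilde{\rho}](x^-,t)\Big).$$

In order to specify the control, one thus needs to approximate the terms on the
right hand side as functions of $(x,t)$. For a small perturbation about a nominally
symmetric bi-directional architecture, a valid approximation is obtained by taking
$x^+ - x = x - x^- = \delta/2 = 1/(2\rho_0)$ and expanding to first order about $x$, which yields

$$u^{(p)}(x,t) = -\frac{1}{\rho_0^2}\left([(k_f - k_b)\tilde{\rho}](x,t) + \frac{1}{2\rho_0}\frac{\partial}{\partial x}[(k_f + k_b)\tilde{\rho}](x,t)\right) = -\frac{k^{(-)}}{\rho_0^2}\,\tilde{\rho} - \frac{1}{2\rho_0^3}\frac{\partial}{\partial x}\big(k^{(+)}\tilde{\rho}\big), \tag{8.11}$$

where

$$k^{(+)}(x) := k_f(x) + k_b(x), \qquad k^{(-)}(x) := k_f(x) - k_b(x). \tag{8.12}$$

The velocity feedback term in (8.9) has a continuous counterpart

$$u^{(v)}(x,t) = -b(x)\,\tilde{v}(x,t). \tag{8.13}$$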


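With the feedback control $u(x,t) = u^{(p)}(x,t) + u^{(v)}(x,t)$, where $u^{(p)}(x,t)$ and
$u^{(v)}(x,t)$ are given by (8.11) and (8.13), the linearized momentum equation (8.8)
becomes

$$\frac{\partial \tilde{v}}{\partial t} = -\left(\frac{k^{(-)}}{\rho_0^2}\,\tilde{\rho} + \frac{1}{2\rho_0^3}\frac{\partial}{\partial x}\big(\tilde{\rho}\,k^{(+)}\big) + b\,\tilde{v}\right).$$

Upon differentiating both sides with respect to $t$ and using the continuity equation
(8.7) we obtain the PDE that describes small velocity perturbations $\tilde{v}$ due to
the inter-connected platoon dynamics:

$$\left(\frac{\partial^2}{\partial t^2} + b\,\frac{\partial}{\partial t}\right)\tilde{v} = \frac{1}{2\rho_0^2}\frac{\partial}{\partial x}\left(k^{(+)}\frac{\partial \tilde{v}}{\partial x}\right) + \frac{k^{(-)}}{\rho_0}\frac{\partial \tilde{v}}{\partial x}. \tag{8.14}$$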


The boundary conditions for the PDE depend upon the dynamics of the first
and the last vehicles in the platoon. For scenario I with constant-velocity fictitious
lead and follow vehicles, the appropriate boundary conditions are of the Dirichlet
type on both ends:

$$\tilde{v}(0,t) = \tilde{v}(2\pi,t) = 0, \qquad \forall t \in [0,\infty). \tag{8.15}$$

For scenario II with only a fictitious lead vehicle, the appropriate boundary
conditions are of Neumann-Dirichlet type:

$$\frac{\partial \tilde{v}}{\partial x}(0,t) = \tilde{v}(2\pi,t) = 0, \qquad \forall t \in [0,\infty). \tag{8.16}$$


8.4.1  Eigenvalue  comparison 


For preliminary validation purposes, we consider the simplest case where the
position control gains are constant for every vehicle, i.e., $k_f(x) = k_b(x) \equiv k_0$ and
$b(x) \equiv b_0$. In such a case $k^{(-)}(x) \equiv 0$, $k^{(+)}(x) \equiv 2k_0$, and the governing PDE (8.14)
simplifies to

$$\left(\frac{\partial^2}{\partial t^2} + b_0\frac{\partial}{\partial t} - \frac{k_0}{\rho_0^2}\frac{\partial^2}{\partial x^2}\right)\tilde{v} = 0. \tag{8.17}$$

Note that this is a damped wave equation with a wave speed of $\sqrt{k_0}/\rho_0$. The wave
equation is consistent with the physical intuition that a symmetric bidirectional
control architecture causes a disturbance to propagate equally in both directions.
Figure 8.2 compares the closed-loop eigenvalues of a discrete platoon with
$N = 25$ vehicles and the PDE (8.17). The eigenvalues of the platoon are obtained
by numerically evaluating the eigenvalues of the matrices $A_{L-F}$ and $A_L$ (defined in
(8.4) and (8.5)). The eigenvalues of the PDE are also computed numerically,
using a Galerkin method [170]. The figure shows that the two sets of eigenvalues
agree well. In particular, the least stable eigenvalues are well captured
by the PDE. Additional validation appears in the following sections, where we
present and compare results for analysis and control design.


8.5  Loss of stability margin with symmetric bidirectional control

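In this section, we analyze the stability of a discrete platoon by evaluating the
eigenvalues of the PDE

$$\left(\frac{\partial^2}{\partial t^2} + b_0\frac{\partial}{\partial t} - a_0^2\frac{\partial^2}{\partial x^2}\right)\tilde{v} = 0, \tag{8.18}$$

where $x \in [0, 2\pi]$, $\rho_0 = N/(2\pi)$ is the mean density, and $a_0$, defined by

$$a_0^2 := \frac{k_0}{\rho_0^2}, \tag{8.19}$$

is the wave speed. The PDE corresponds to the platoon with symmetric and
constant control gains: $k_f(x) = k_b(x) \equiv k_0$ and $b(x) \equiv b_0$. On taking the Laplace
transform, one obtains the characteristic equation

$$s^2 + b_0 s - a_0^2\lambda = 0, \tag{8.20}$$

where $\lambda$ is an eigenvalue of the Laplacian, i.e.,

$$\frac{d^2\eta}{dx^2} = \lambda\,\eta(x), \tag{8.21}$$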
Figure 8.2. Comparison of closed-loop eigenvalues of the platoon dynamics and the
spectrum of the corresponding PDE (8.18) for the two different scenarios: (a) platoon
with fictitious lead and follow vehicles, and correspondingly the PDE (8.18)
with Dirichlet boundary conditions; (b) platoon with fictitious lead vehicle, and
correspondingly the PDE (8.18) with Neumann-Dirichlet boundary conditions.
For ease of comparison, only a few of the eigenvalues are shown. Both plots are
for $N = 25$ vehicles; the controller parameters are $k_i^{(f)} = k_i^{(b)} = 1$ and $b_i = 0.5$ for
$i = 1, 2, \ldots, N$, and for the PDE $k_f(x) \equiv 1$ and $b(x) \equiv 0.5$.




boundary condition                                                  eigenvalue $\lambda_l$       eigenfunction $\psi_l(x)$        $l$
$\eta(0) = \eta(2\pi) = 0$ (Dirichlet-Dirichlet)                    $-\frac{l^2}{4}$             $\sin(\frac{lx}{2})$             $l = 1, 2, \ldots$
$\frac{d\eta}{dx}(0) = \eta(2\pi) = 0$ (Neumann-Dirichlet)          $-\frac{(2l-1)^2}{16}$       $\cos(\frac{(2l-1)x}{4})$        $l = 1, 2, \ldots$

Table 8.2. The eigen-solutions for the Laplacian with two different boundary
conditions.

and $\eta$ is an eigenfunction satisfying appropriate boundary conditions: (8.15) for
scenario I and (8.16) for scenario II. The eigen-solutions for the two scenarios are
given by the following simple Lemma.

Lemma 8.5.1. Consider the eigenvalue problem (8.21) for the Laplacian with
boundary conditions (8.15) and (8.16) corresponding to scenarios I and II,
respectively. The eigenvalues and the eigenfunctions for the two scenarios are
given in Table 8.2. The eigenfunctions for either scenario provide a basis of
$L^2([0, 2\pi])$. □


Proof. It is a simple calculation to verify that the eigenvalues and eigenfunctions
given in the table satisfy the eigenvalue problem. The eigenfunctions of the Laplacian
on $[0, 2\pi]$ are known to provide a basis for $L^2([0, 2\pi])$ [171]. ■
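The "simple calculation" can also be cross-checked numerically. The sketch below discretizes $d^2/dx^2$ on $[0, 2\pi]$ with second-order central differences (a ghost point enforces the Neumann condition at $x = 0$) and compares the leading eigenvalues with $-l^2/4$ and $-(2l-1)^2/16$, the values the operator takes for the eigenfunctions $\sin(lx/2)$ and $\cos((2l-1)x/4)$. The discretization and the function name `laplacian_eigs` are our choices, not the Galerkin method cited earlier.

```python
import numpy as np

def laplacian_eigs(n, neumann_left=False, count=3):
    """Largest `count` eigenvalues of d^2/dx^2 on [0, 2*pi] with grid spacing 2*pi/n.

    The right end is always Dirichlet; the left end is Dirichlet unless
    neumann_left is True.
    """
    h = 2 * np.pi / n
    size = n if neumann_left else n - 1    # unknowns eta_0..eta_{n-1} or eta_1..eta_{n-1}
    D = (np.diag(-2.0 * np.ones(size))
         + np.diag(np.ones(size - 1), 1)
         + np.diag(np.ones(size - 1), -1)) / h**2
    if neumann_left:
        D[0, 1] = 2.0 / h**2               # ghost point: eta_{-1} = eta_1
    eig = np.sort(np.linalg.eigvals(D).real)[::-1]
    return eig[:count]

dd = laplacian_eigs(400)                       # Dirichlet-Dirichlet
nd = laplacian_eigs(400, neumann_left=True)    # Neumann-Dirichlet
```

The smallest-magnitude Neumann-Dirichlet eigenvalue is a quarter of the Dirichlet-Dirichlet one, which is the origin of the factor of 4 between the two scenarios discussed later in this section.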


To see the effect of $N$ on stability, we evaluate the eigenvalues (roots of the
characteristic equation (8.20)) for the Dirichlet boundary conditions (scenario I).
Using Table 8.2, the $l$th eigenvalue pair is given by

$$s_l^\pm = \frac{-b_0 \pm \sqrt{b_0^2 - a_0^2 l^2}}{2}, \tag{8.22}$$
(a) Eigenvalues move toward zero with increasing $N$. (b) Mistuning “exchanges”
stability between $s_l^+$ and $s_l^-$.

Figure 8.3. A schematic explaining the loss of stability as $N$ increases and how
mistuning ameliorates this loss.

where $l = 1, 2, \ldots$. The real part of the eigenvalue depends upon the discriminant
$D(l,N) := b_0^2 - a_0^2 l^2$, where the wave speed $a_0$ depends both on the control gain $k_0$
and on the number of vehicles $N$ (see (8.19)). For a fixed control gain, there are two
cases to consider:

1. If $D(l,N) < 0$, the roots $s_l^\pm$ are complex with the real part given by $-b_0/2$.

2. If $D(l,N) > 0$, the roots $s_l^\pm$ are real with $s_l^+ + s_l^- = -b_0$.

In the former case, the damping is determined by the velocity feedback term $b_0$,
while in the latter case one eigenvalue ($s_l^-$) gains damping at the expense of the
other ($s_l^+$), which loses damping. When $s_l^\pm$ are real, the eigenvalue $s_l^+$ is closer
to the origin than $s_l^-$, so we call $s_l^+$ the $l$th less-stable eigenvalue. The following
lemma gives the dependence of this eigenvalue on the number of vehicles $N$.

Lemma 8.5.2. Consider the eigenvalue problem for the linear PDE (8.18) with
boundary conditions (8.15) and (8.16), corresponding to scenarios I and II,
respectively. The $l$th less-stable eigenvalue $s_l^+$ approaches 0 as $O(1/N^2)$ in the limit as
$N \to \infty$. The asymptotic formulas appear in Table 8.3. □

boundary condition      $s_l^+$ for $l \ll l_c$                                                                   $l_c$
Dirichlet-Dirichlet     $-\frac{\pi^2 k_0}{b_0}\frac{l^2}{N^2} + O(\frac{1}{N^4})$                                $\frac{b_0 N}{2\pi\sqrt{k_0}}$
Neumann-Dirichlet       $-\frac{\pi^2 k_0}{4 b_0}\frac{(2l-1)^2}{N^2} + O(\frac{1}{N^4})$                         $\frac{b_0 N}{2\pi\sqrt{k_0}}$

Table 8.3. The trend of the less stable eigenvalue $s_l^+$ for the PDE (8.18).


Proof. We first consider scenario I with Dirichlet boundary conditions (8.15).
Using (8.22) and (8.19),

$$2 s_l^\pm = -b_0 \pm b_0\left(1 - \frac{a_0^2 l^2}{b_0^2}\right)^{1/2} = -b_0 \pm b_0\left(1 - \frac{2\pi^2 k_0}{b_0^2}\frac{l^2}{N^2} + O\!\left(\frac{1}{N^4}\right)\right),$$

for $a_0^2 l^2 / b_0^2 \ll 1$. The asymptotic formula holds for wave numbers

$$l \ll \frac{b_0}{a_0} = \frac{b_0 N}{2\pi\sqrt{k_0}} =: l_c, \tag{8.23}$$

and in particular for each $l$ as $N \to \infty$. The proof for scenario II with
Neumann-Dirichlet boundary conditions (8.16) follows similarly. ■
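The expansion in the proof is easy to check against the exact roots. The following sketch evaluates the root pair $s_l^\pm = \big(-b_0 \pm \sqrt{b_0^2 - a_0^2 l^2}\big)/2$ for the Dirichlet case, assuming $a_0^2 = k_0/\rho_0^2$ with $\rho_0 = N/(2\pi)$, and compares $s_l^+$ with the leading term $-\pi^2 k_0 l^2/(b_0 N^2)$. The function names and default gains ($k_0 = 1$, $b_0 = 0.5$) are illustrative.

```python
import cmath
import math

def pde_eigs(l, N, k0=1.0, b0=0.5):
    """Exact roots (s_plus, s_minus) of s^2 + b0*s + a0^2*l^2/4 = 0 (Dirichlet case)."""
    a0_sq = k0 * (2 * math.pi / N) ** 2          # a0^2 = k0 / rho0^2, rho0 = N/(2*pi)
    root = cmath.sqrt(b0**2 - a0_sq * l**2)      # complex when the discriminant is negative
    return (-b0 + root) / 2, (-b0 - root) / 2

def asymptote(l, N, k0=1.0, b0=0.5):
    """Leading-order term of s_l^+ for l much smaller than l_c."""
    return -math.pi**2 * k0 * l**2 / (b0 * N**2)

def critical_N(k0=1.0, b0=0.5):
    """Value of N at which the l = 1 discriminant vanishes."""
    return 2 * math.pi * math.sqrt(k0) / b0

s_plus, s_minus = pde_eigs(1, 200)     # large platoon: real, well-separated roots
c_plus, c_minus = pde_eigs(1, 10)      # small platoon: complex-conjugate pair
```

For the large platoon the exact $s_1^+$ and the leading-order term agree to within a fraction of a percent, while for the small platoon the pair is complex with real part $-b_0/2$.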


Figure 8.3(a) graphically illustrates the destabilization by depicting the movement
of the eigenvalues $s_1^\pm$ as $N$ increases. For sufficiently small values of $N$, the
discriminant $D(1,N)$ is negative and the eigenvalues $s_1^\pm$ are complex. The real
part of the eigenvalues depends only on the value of $b_0$. At the critical value
$N = N_c := 2\pi\sqrt{k_0}/b_0$, the discriminant becomes zero, $s_1^+ = s_1^-$, and the eigenvalues
collide on the real axis. For values of $N > N_c$ and in particular as $N \to \infty$, the
eigenvalue $s_1^+$ asymptotes to 0 while staying real, and $s_1^-$ asymptotes to $-b_0$. Their
cumulative damping, as reflected in the sum $s_1^+ + s_1^- = -b_0$, is conserved. In
other words, $s_1^+$ is destabilized at the expense of $s_1^-$. The Lemma shows that the
least stable eigenvalue admits the asymptotic expansions

$$s_1^+ = -\frac{\pi^2 k_0}{b_0 N^2} + O\!\left(\frac{1}{N^4}\right) \quad \text{(Dirichlet-Dirichlet)} \tag{8.24}$$

$$s_1^+ = -\frac{\pi^2 k_0}{4 b_0 N^2} + O\!\left(\frac{1}{N^4}\right) \quad \text{(Neumann-Dirichlet)} \tag{8.25}$$

as $N \to \infty$. Therefore, for large values of $N$, the least-stable eigenvalue asymptotes
to zero as $O(1/N^2)$.


Finally, we present numerical computations that corroborate this PDE-based
analysis. Figure 8.4 plots, as a function of $N$, the least stable eigenvalue of the
PDE and of the discrete platoon as well as the prediction from the asymptotic
formula. The eigenvalues for the discrete platoon are obtained by numerically
evaluating the eigenvalues of the matrices $A_{L-F}$ and $A_L$ (see (8.4) and (8.5)) with
constant control gains $k_i^{(f)} = k_i^{(b)} = k_0 = 1$ and $b_i = b_0 = 0.5$ for $i = 1, \ldots, N$.
The comparison shows that the PDE analysis closely matches the eigenvalue of
the discrete platoon.

Remark 8.5.1. The preceding analysis shows that the loss of stability experienced
with a symmetric bidirectional architecture is controller independent. The least
stable eigenvalue approaches 0 as $O(1/N^2)$ irrespective of the values of the gains
$k_0$ and $b_0$, as long as they are fixed constants independent of $N$. Equations (8.24)
and (8.25) also imply that for the least stable eigenvalue to be uniformly bounded
away from 0, one has to increase the control gain $k_0$ as $N^2$. This is consistent
with the conclusion of Jovanovic et al., who studied the LQR control problem of
a platoon on a circle [158].

Remark 8.5.2. The result of Lemma 8.5.2 also indicates that the platoon with
both a fictitious leader and a follower (scenario I) has a higher stability margin
than the platoon with only a fictitious leader (scenario II). We note, however, that in
scenario I, the absolute position of the fictitious follow vehicle must be provided
to the last vehicle in the platoon. Therefore scenario I requires one extra piece of
global information as compared to scenario II. The result provides a numerical
measure of the benefit of this extra information: a factor of 4 improvement in
the closed-loop damping.

Figure 8.4. Comparison of the least stable eigenvalue of the closed-loop platoon
dynamics and that predicted by Lemma 8.5.2 with symmetric bidirectional
control, plotted against $N$. In the plot legends, “D-D” stands for “Dirichlet-Dirichlet”, “N-D” for
“Neumann-Dirichlet”, “L-F” for fictitious leader-follower, and “L” for fictitious
leader. The plot for “PDE (8.18), D-D” should be compared with “platoon, L-F”
since they both correspond to scenario I. Similarly, “PDE (8.18), N-D” and “platoon,
L” correspond to scenario II. Note that the predictions (8.24) and (8.25)
are valid for $1 \ll l_c$ (defined in (8.23)), which in this case means for $N \gg 12$.


8.6  Reducing  loss  of  stability  by  mistuning 

With symmetric bidirectional control, $k^{(-)}(x) \equiv 0$, so the only term left in
the right hand side of the governing PDE (8.14) is an $O(1/N^2)$ term. This explains
the decay of the least stable eigenvalue as $1/N^2$. Any amount of asymmetry
between the front and the back gain functions $k_f(x)$ and $k_b(x)$ will make $k^{(-)}(x)$
not identically zero, so the right hand side will contain an $O(1/N)$ term. This gives
us hope that the least stable eigenvalue might decay more slowly in the presence of
such asymmetry, no matter how small. We will now show that this is indeed the
case, and determine the gain profiles that achieve this slower rate of decay.

We consider the eigenvalue problem for the PDE (8.14) where the control
gains are designed (mistuned) with the objective of minimizing the least-stable
eigenvalue $s_1^+$. In particular, we consider forward and backward position feedback
gain profiles:

$$k_f(x) = k_0 + \epsilon\,\tilde{k}_f(x), \qquad k_b(x) = k_0 + \epsilon\,\tilde{k}_b(x),$$

where $\epsilon > 0$ is a small parameter signifying the amount of mistuning and $\tilde{k}_f(x)$,
$\tilde{k}_b(x)$ denote the perturbation profiles. Define

$$k_s(x) := \tilde{k}_f(x) + \tilde{k}_b(x), \qquad k_m(x) := \tilde{k}_f(x) - \tilde{k}_b(x),$$

so that from (8.12),

$$k^{(+)}(x) = 2k_0 + \epsilon\,k_s(x), \qquad k^{(-)}(x) = \epsilon\,k_m(x).$$

The mistuned version of the PDE (8.14) is then given by

$$\frac{\partial^2 \tilde{v}}{\partial t^2} + b_0\frac{\partial \tilde{v}}{\partial t} - a_0^2\frac{\partial^2 \tilde{v}}{\partial x^2} = \epsilon\,\mathcal{P}\tilde{v}, \tag{8.26}$$

where

$$\mathcal{P}\tilde{v} := \frac{k_m}{\rho_0}\frac{\partial \tilde{v}}{\partial x} + \frac{1}{2\rho_0^2}\frac{\partial}{\partial x}\left(k_s\frac{\partial \tilde{v}}{\partial x}\right). \tag{8.27}$$

In the remainder of this chapter, we study the problem of optimizing the stability
margins by judicious choice of $k_m(x)$ and $k_s(x)$. In effect, the results of our
investigation, carried out in the following two sections using perturbation and
optimization methods, provide a systematic framework for designing control gains
in the discrete platoon.

8.6.1  Perturbation analysis

The control objective is to design mistuning profiles $k_m(x)$ and $k_s(x)$ to minimize
the least stable eigenvalue $s_1^+$. To achieve this, we first use a perturbation
method, borrowed from [168], to obtain an explicit asymptotic formula for the
eigenvalues.
Theorem 8.6.1. Consider the eigenvalue problem for the mistuned PDE (8.26)
with Dirichlet boundary condition (8.15) corresponding to scenario I. The $l$th
eigenvalue pair is given by the asymptotic formula

$$s_l^+(\epsilon) = \epsilon\,\frac{l}{2 b_0 N}\int_0^{2\pi} k_m(x)\sin(lx)\,dx + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right),$$

$$s_l^-(\epsilon) = -b_0 - \epsilon\,\frac{l}{2 b_0 N}\int_0^{2\pi} k_m(x)\sin(lx)\,dx + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right),$$

that is valid for each $l$ in the limit as $\epsilon \to 0$ and $N \to \infty$. □


The perturbation formula is useful because it suggests the most beneficial
mistuning profile $k_m(x)$ in the limit $\epsilon \to 0$, which is summarized in the next
corollary.


Corollary 8.6.1. Consider the problem of minimizing the least-stable eigenvalue
of the PDE (8.26) with Dirichlet boundary condition (8.15) by choosing a function
$k_m(x) \in L^2([0, 2\pi])$ with norm constraint $\int_0^{2\pi} |k_m(x)|^2\,dx = 1$. In the limit as
$\epsilon \to 0$, the optimal mistuning profile is given by $k_m(x) = -\frac{1}{\sqrt{\pi}}\sin(x)$. With this
profile, the least stable eigenvalue is given by the asymptotic formula

$$s_1^+(\epsilon) = -\frac{\epsilon\sqrt{\pi}}{2 b_0 N} + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right)$$

in the limit as $\epsilon \to 0$ and $N \to \infty$. □


This result shows that even with an arbitrarily small amount of mistuning $\epsilon$,
one can improve the closed-loop platoon damping by a large amount, especially
for large values of $N$. The least-stable eigenvalue $s_1^+$ asymptotes to 0 as $O(1/N)$ in
the mistuned case as opposed to $O(1/N^2)$ in the nominal case.
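The optimality claim can be probed numerically by evaluating the first-order term for a few unit-norm profiles. The sketch below assumes the first-order coefficient for the Dirichlet case is $\epsilon\,\frac{l}{2 b_0 N}\int_0^{2\pi} k_m(x)\sin(lx)\,dx$; that coefficient, the helper name `first_order_shift`, and the parameter values are assumptions made for illustration.

```python
import math

def first_order_shift(km, l, N, b0, eps, samples=20000):
    """First-order term eps * l/(2*b0*N) * int_0^{2pi} km(x) sin(l x) dx (midpoint rule)."""
    h = 2 * math.pi / samples
    integral = h * sum(km((j + 0.5) * h) * math.sin(l * (j + 0.5) * h)
                       for j in range(samples))
    return eps * l / (2 * b0 * N) * integral

N, b0 = 100, 0.5
eps = 0.2 * math.sqrt(math.pi)

def km_opt(x):        # candidate optimal profile, unit L2 norm on [0, 2*pi]
    return -math.sin(x) / math.sqrt(math.pi)

def km_other(x):      # another unit-norm profile, orthogonal to sin(x)
    return -math.sin(2 * x) / math.sqrt(math.pi)

shift_opt = first_order_shift(km_opt, 1, N, b0, eps)
shift_other = first_order_shift(km_other, 1, N, b0, eps)
predicted = -eps * math.sqrt(math.pi) / (2 * b0 * N)
```

By the Cauchy-Schwarz inequality, no unit-norm profile can make the integral more negative than the one proportional to $-\sin(x)$. A profile orthogonal to $\sin(x)$, such as `km_other`, produces no first-order improvement in $s_1^+$ at all.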

Figure 8.3(b) graphically illustrates the mechanism by which mistuning affects
the movement of eigenvalues as $N$ increases. By properly choosing the
mistuning patterns $k_m(x)$ and $k_s(x)$, the damping can be exchanged between the
eigenvalues $s_l^+$ and $s_l^-$, so that the less stable eigenvalue $s_l^+$ “gains” stability at
the expense of the more stable eigenvalue $s_l^-$. The net amount of damping is
preserved, since the first-order perturbations to $s_l^+$ and $s_l^-$ sum to zero (as seen
from Theorem 8.6.1).

Figure 8.5 presents a validation of these results by comparing the numerically
obtained mistuned and nominal eigenvalues for both the discrete platoon and the
PDE; the parameter values are indicated in the figure caption. The figure shows
that

1. the platoon eigenvalues match the PDE eigenvalues accurately over a range
of $N$, and

2. the mistuned eigenvalues show an order of magnitude improvement over
the nominal case even though the controller gains differ from their nominal
values only by ±10%.
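A comparison of this kind for the discrete platoon can be sketched as follows. The code assumes scenario I with state matrix $[0,\ I;\ -K^{(f)}M^T - K^{(b)}M,\ -b_0 I]$, desired positions $y_i^d = 2\pi - i\delta$ with $\delta = 2\pi/(N+1)$, and sampled gains $k_i^{(f)} = k_0 - 0.1\sin(y_i^d)$, $k_i^{(b)} = k_0 + 0.1\sin(y_i^d)$. The function names are ours and the sketch is not the code used to produce the figure.

```python
import numpy as np

def least_stable_real(kf, kb, b0):
    """Largest real part among the eigenvalues of the scenario-I state matrix."""
    N = kf.size
    M = np.eye(N) - np.eye(N, k=1)
    A = np.block([[np.zeros((N, N)), np.eye(N)],
                  [-np.diag(kf) @ M.T - np.diag(kb) @ M, -b0 * np.eye(N)]])
    return np.linalg.eigvals(A).real.max()

def compare(N, k0=1.0, b0=0.5, amp=0.1):
    """Return (nominal, mistuned) least-stable real parts for an N-vehicle platoon."""
    delta = 2 * np.pi / (N + 1)                     # scenario I spacing
    yd = 2 * np.pi - delta * np.arange(1, N + 1)    # desired positions of vehicles 1..N
    nominal = least_stable_real(k0 * np.ones(N), k0 * np.ones(N), b0)
    mistuned = least_stable_real(k0 - amp * np.sin(yd), k0 + amp * np.sin(yd), b0)
    return nominal, mistuned

nominal, mistuned = compare(100)
```

Calling `compare` over a range of `N` gives the two platoon curves; with these gain perturbations the mistuned value should lie to the left of the nominal one.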

For comparison, the figure also depicts the asymptotic eigenvalue formula given
in Corollary 8.6.1. Similar results are also obtained for scenario II, which are
summarized by the next theorem.

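Theorem 8.6.2. Consider the eigenvalue problem for the mistuned PDE (8.26)
with Neumann-Dirichlet boundary condition (8.16) corresponding to scenario II.
The $l$th eigenvalue pair is given by the asymptotic formula

$$s_l^+(\epsilon) = -\epsilon\,\frac{2l-1}{4 b_0 N}\int_0^{2\pi} k_m(x)\sin\!\left(\frac{(2l-1)x}{2}\right)dx + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right),$$

$$s_l^-(\epsilon) = -b_0 + \epsilon\,\frac{2l-1}{4 b_0 N}\int_0^{2\pi} k_m(x)\sin\!\left(\frac{(2l-1)x}{2}\right)dx + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right),$$

that is valid for each $l$ in the limit as $\epsilon \to 0$ and $N \to \infty$. □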
Figure 8.5. The least stable eigenvalue of the closed-loop platoon (i.e., of $A_{L-F}$
in (8.4)) and of the PDE (8.26) with Dirichlet boundary conditions, with and
without mistuning, for a range of values of $N$. Parameters for the nominal case
are $k_0 = 1$ and $b_0 = 0.5$. In the mistuned case, forward and backward controller
gains are chosen as $k_f(x) = k_0 - 0.1\sin(x)$, $k_b(x) = k_0 + 0.1\sin(x)$ (i.e., $k_m(x) =
-\sin(x)/\sqrt{\pi}$, $k_s(x) = 0$ and $\epsilon = 0.2\sqrt{\pi}$). For the platoon, controller gains are
chosen by sampling the gains for the PDE as $k_i^{(f)} = 1 - 0.1\sin(y_i^d)$, $k_i^{(b)} = 1 +
0.1\sin(y_i^d)$, where $y_i^d$ defined in (8.2) is the desired position of the $i$th vehicle. The
legend “Corollary 8.6.1” refers to the prediction by Corollary 8.6.1. Note that the
prediction of Corollary 8.6.1 is plotted only for $N > 16$ to ensure that $1 \ll l_c$
(see (8.23)). Legend entries: PDE, nominal; PDE, mistuned; platoon, nominal;
platoon, mistuned; Corollary 8.6.1.
As with scenario I, here again we use the above result to determine the most
beneficial profile $k_m(x)$ for small $\epsilon$:


Corollary 8.6.2. Consider the problem of minimizing the least-stable eigenvalue
of the PDE (8.26) with Neumann-Dirichlet boundary conditions (8.16) by choosing
a function $k_m(x) \in L^2([0, 2\pi])$ with norm constraint $\int_0^{2\pi} |k_m(x)|^2\,dx = 1$. In the
limit as $\epsilon \to 0$, the optimal mistuning profile is given by $k_m(x) = \frac{1}{\sqrt{\pi}}\sin(\frac{x}{2})$. With
this profile, the least-stable eigenvalue is given by the asymptotic formula

$$s_1^+(\epsilon) = -\frac{\epsilon\sqrt{\pi}}{4 b_0 N} + O(\epsilon^2) + O\!\left(\frac{1}{N^2}\right)$$

in the limit as $\epsilon \to 0$ and $N \to \infty$. □


Eigenvalue trends together with their validation for scenario II appear in
Fig. 8.6.

The  proof  of  Theorem  8.6.1  is  presented  below.  The  proof  of  Theorem  8.6.2  is 
analogous  and  is  therefore  omitted. 


Proof of Theorem 8.6.1. The spatial inhomogeneity introduced by the $x$-dependent
coefficients $k_m(x)$ and $k_s(x)$ destroys the spatial invariance of the nominal PDE (8.18).
Hence, the Fourier basis (the eigenfunctions of the Laplacian) no longer leads to
a diagonalization of the mistuned PDE. The methods of Section 8.5 thus need
to be suitably modified. In order to compute the eigenvalues for the mistuned
PDE (8.26), we take a Laplace transform of (8.26) and get

$$-a_0^2\frac{d^2\eta}{dx^2} + s^2\eta + b_0 s\,\eta = \epsilon\left[\frac{k_m}{\rho_0}\frac{d\eta}{dx} + \frac{1}{2\rho_0^2}\frac{d}{dx}\left(k_s\frac{d\eta}{dx}\right)\right]. \tag{8.28}$$

We are interested in eigenvalues of (8.28) with Dirichlet boundary conditions, i.e.,
the values of $s$ for which a solution to the homogeneous PDE (8.28) exists with
Figure 8.6. The least stable eigenvalue of the closed loop platoon in scenario II
(i.e., of $A_L$ in (8.4)) and of the PDE (8.26) with Neumann-Dirichlet b.c., with and
without mistuning, for a range of values of $N$. Parameters for the nominal case are
$k_0 = 1$, $b_0 = 0.5$. In the mistuned case, forward and backward controller gains are
chosen as $k_f = k_0 + 0.1 \sin(\frac{x}{2})$, $k_b = k_0 - 0.1 \sin(\frac{x}{2})$ (i.e., $k_m = \frac{1}{\sqrt{\pi}} \sin(\frac{x}{2})$, $k_s(x) = 0$
and $\epsilon = 0.2\sqrt{\pi}$). For the platoon, the gains are chosen as $k_i^{(f)} = k_0 + 0.1 \sin(y_i^d/2)$
and $k_i^{(b)} = k_0 - 0.1 \sin(y_i^d/2)$, where $y_i^d$, defined in (8.2), is the desired position of the
$i$th vehicle. The legend “Corollary 8.6.2” refers to the prediction by Corollary 8.6.2
of mistuned PDE eigenvalues.
boundary conditions $\eta(0) = \eta(2\pi) = 0$. To obtain these eigenvalues, we use a
perturbation method expressing the eigenfunction and eigenvalue in a series form:

\[ \eta(x) = \eta_0(x) + \epsilon \eta_1(x) + O(\epsilon^2), \tag{8.29} \]
\[ s = r_0 + \epsilon r_1 + O(\epsilon^2). \tag{8.30} \]


We note that $\epsilon r_1$ denotes the perturbation to the nominal eigenvalue $r_0$ as a result
of the mistuning. Substituting (8.29) and (8.30) in (8.28) and doing an $O(1)$ balance, we get

\[ O(1): \quad -a_0^2 (\eta_0)_{xx} + r_0^2 \eta_0 + b_0 r_0 \eta_0 = 0, \tag{8.31} \]


whose eigen-solution is given by

\[ \eta_0 = d_l \sin\left(\frac{lx}{2}\right), \tag{8.32} \]
\[ r_0 = s_l^{\pm}(0), \tag{8.33} \]

where $l = 1, 2, \ldots$, $d_l$ is an arbitrary real constant, and $s_l^{\pm}(0)$ is given by (8.22).
Next,


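\[ O(\epsilon): \quad -a_0^2 (\eta_1)_{xx} + (r_0^2 + b_0 r_0) \eta_1 = \frac{k_m}{\rho_0} \frac{d\eta_0}{dx} + \frac{1}{2\rho_0^2} \frac{d}{dx}\left( k_s \frac{d\eta_0}{dx} \right) - (2 r_0 r_1 + b_0 r_1) \eta_0 =: R. \tag{8.34} \]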


Substituting $r_0 = s_l^{\pm}(0)$ on the left hand side leads to a resonance condition for
the right hand side term, denoted by $R$. In particular, for a solution $\eta_1$ to exist,
$R$ must lie in the range space of the linear operator

\[ -a_0^2 \frac{d^2}{dx^2} + r_0^2 + b_0 r_0. \tag{8.35} \]

For this self-adjoint operator, the range space is the complement of its null space
$\{\sin(\frac{lx}{2})\}$. This gives the resonance condition as

\[ \frac{1}{\pi} \left\langle R, \sin\left(\frac{lx}{2}\right) \right\rangle = 0, \]
where $\langle \cdot, \cdot \rangle$ denotes the standard inner product in $L^2(0, 2\pi)$. Explicitly, this
leads to an equation


l  f27r  l 2  f2n  9  lx 

(2r0  +  b0)r1  =  - -  /  km(x)  s'm(lx)dx  +  - — 9  /  ks(x)  cos2(—)dx  (8.36) 

Jo  Jo  ^ 

For values of $r_0 = s_l^{\pm}(0)$, where $s_l^{\pm}(0)$ is given by (8.22), the equation above
leads to an expression for the perturbation in the two eigenvalues. We denote these
perturbations as $r_1^{\pm}$. For $r_0 = s_l^+(0)$, we have from Lemma 8.5.2 that
$b_0 \gg |2 r_0|$ when $l \ll l_c$, so that


l  f27r  1 

rf  «  - - —  km(x)sm(lx)dx  +  0(— ).  (8.37) 

4vrpoOo  Jo  TV2 

Note that we have dropped the second integral on the right hand side of (8.36)
because its coefficient is $O(1/N^2)$ for large $N$. For $r_0 = s_l^-(0)$, $2 r_0 \approx -2 b_0$ for $l \ll l_c$ and


r»27r 


- - —  /  km(x)sm(lx)dx  +  0(— ). 

47rpo&o  Jo  N- 


(8.38) 


Note that

\[ r_1^+ + r_1^- = 0. \]


Putting  the  formulas  for  the  perturbation  to  the  eigenvalues  (8.37)  and  (8.38) 
in  (8.30),  we  get 


4(e)  ~^+(0)  -e 


/ 


-2tt 


4vr60Po 


km(x)  sin (lx)dx  +  0(e2)  +  0(— ), 

iv  z 


s,  (e)  ~  -2b0  +  e 


l 


-2tt 


.  ,  /  km(x)  sm(lx)dx  +  0(e2)  +  0(— r). 

4tt60po  Jo  7W 


Since $s_l^+(0) = O(1/N^2)$ for $l < l_c$ (Lemma 8.5.2) and $\rho_0 = N/(2\pi)$, the result follows.
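The first-order balance used in the proof can be checked numerically on a finite-difference grid. The following Python sketch is not part of the original analysis. It assumes $k_s = 0$, the operator $L(\epsilon) = a_0^2\, d^2/dx^2 + \epsilon (k_m/\rho_0)\, d/dx$ with Dirichlet boundary conditions on $[0, 2\pi]$ (the form that appears in the proof of Lemma 8.6.1), and $a_0^2 = k_0/\rho_0^2$ with $\rho_0 = N/(2\pi)$. The grid size, the test profile, and all function names are illustrative choices. It compares the principal eigenvalue of the discretized operator with the first-order estimate $\lambda_0 + \epsilon \langle L_1 \eta_0, \eta_0 \rangle / \langle \eta_0, \eta_0 \rangle$, the discrete counterpart of the resonance condition.

```python
import math

def principal_eig(lower, diag, upper, iters=8000):
    """Largest eigenvalue of a tridiagonal matrix with positive off-diagonals.

    Row i acts as lower[i]*v[i-1] + diag[i]*v[i] + upper[i]*v[i+1].
    Power iteration runs on the shifted matrix M + c*I, which is entrywise
    nonnegative, so it converges to the Perron (principal) eigenvalue.
    """
    n = len(diag)
    c = 2.0 * max(abs(d) for d in diag)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [(diag[i] + c) * v[i]
             + (lower[i] * v[i - 1] if i > 0 else 0.0)
             + (upper[i] * v[i + 1] if i < n - 1 else 0.0)
             for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - c

def discretize(km, eps, n_vehicles=50, n_grid=40, k0=1.0):
    """Central-difference matrix of a0^2 d2/dx2 + eps*(km/rho0) d/dx."""
    rho0 = n_vehicles / (2.0 * math.pi)   # assumed mean density N / (2*pi)
    a0sq = k0 / rho0 ** 2                 # assumed a0^2 = k0 / rho0^2
    h = 2.0 * math.pi / (n_grid + 1)
    xs = [(i + 1) * h for i in range(n_grid)]
    adv = [eps * km(x) / (2.0 * h * rho0) for x in xs]
    lower = [a0sq / h ** 2 - a for a in adv]
    upper = [a0sq / h ** 2 + a for a in adv]
    diag = [-2.0 * a0sq / h ** 2] * n_grid
    return xs, h, rho0, (lower, diag, upper)

def first_order_shift(km, n_vehicles=50, n_grid=40):
    """<L1 eta0, eta0> / <eta0, eta0> with eta0 = sin(x/2), the nominal l = 1 mode."""
    xs, h, rho0, _ = discretize(km, 0.0, n_vehicles, n_grid)
    eta = [0.0] + [math.sin(0.5 * x) for x in xs] + [0.0]  # boundary zeros added
    num = sum(eta[i] * km(xs[i - 1]) / rho0 * (eta[i + 1] - eta[i - 1]) / (2.0 * h)
              for i in range(1, n_grid + 1))
    den = sum(e * e for e in eta)
    return num / den

km = lambda x: -math.sin(x) / math.sqrt(math.pi)   # unit-norm test profile
eps = 0.005
lam0 = principal_eig(*discretize(km, 0.0)[3])      # nominal principal eigenvalue
lam_eps = principal_eig(*discretize(km, eps)[3])   # perturbed principal eigenvalue
estimate = lam0 + eps * first_order_shift(km)      # first-order prediction
```

For small `eps` the gap between `lam_eps` and `estimate` is second order in `eps`, so it should be a small fraction of the first-order shift itself.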


Remark 8.6.1. The asymptotic formulae for $s_1^+$ in Corollary 8.6.1 and Corollary 8.6.2
are valid only in the limit $\epsilon \to 0$. However, one would like to be able
to use them with somewhat larger values of $\epsilon$ to realize the benefit of mistuning.
Figure 8.7. The real parts of six eigenvalues (closest to 0) of the closed loop platoon
dynamics for Scenario I, and their comparison with the PDE eigenvalues with
Dirichlet-Dirichlet boundary conditions, for mistuned gains $k_f(x) = 1 - 0.1 \sin(x)$,
$k_b(x) = 1 + 0.1 \sin(x)$ and $b(x) = 0.5$. As predicted by the S-L theory, the least
stable eigenvalue stays the least stable, although eigenvalues that were more stable
merge with it as $N$ increases.

To do so, one has to preclude the possibility of “eigenvalue cross-over,” i.e., of the
second ($s_2^+$) or some other marginally stable eigenvalue becoming the least
stable eigenvalue in the presence of mistuning. It turns out that such a cross-over
is ruled out as a consequence of the Sturm-Liouville (S-L) theory for elliptic
boundary value problems. The standard argument relies on the positivity of the
eigenfunction corresponding to $s_1^+$; the reader is referred to [171] for the details.
Figure 8.7 verifies this numerically by depicting the six eigenvalues closest to 0
(for both the PDE and the discrete platoon) as a function of $N$ when mistuning
is applied.
8.6.2  Eigenvalue optimization


For relatively large values of $\epsilon$, the mistuning profiles obtained in the previous
section may not be optimal. One therefore needs to find the optimal mistuning
gain profiles that minimize the least stable eigenvalue of the PDE (8.26). It
can be shown in a straightforward manner that the least-stable eigenvalue of the
PDE (8.26) has the following property:

\[ s_1^+ \to \frac{\lambda_1}{b_0} \quad \text{as } N \to \infty, \tag{8.39} \]

where $\lambda_1$ is the principal (with largest real part) eigenvalue of $L(\epsilon)$ (defined
in (8.27)). Thus, in the limit of a large number of vehicles, the problem of minimizing
the least stable eigenvalue of the PDE (8.26) is equivalent to minimizing
$\lambda_1$ by choosing the functions $k_m(x), k_s(x) \in L^2$. By a standard argument
in S-L theory, $\lambda_1$ is real with a positive eigenfunction [171].

For the problem of minimizing $\lambda_1$ to be well-posed, an additional constraint on
$k_m(x)$ and $k_s(x)$ is needed. In the following, we assume $k_s(x) = 0$ and impose the
constraint

\[ \int_0^{2\pi} |k_m(x)|^2 \, dx = 1. \tag{8.40} \]

$k_s(x) = 0$ is assumed for the sake of simplicity of the presentation and because
it appears only in a term whose coefficient is $O(1/N^2)$. Any improvement due to
$k_s(x)$ alone is $O(1/N^2)$, while $k_m(x)$ can potentially deliver an $O(1/N)$ shift in
eigenvalue location. This is also reflected in estimates obtained using the
perturbation methods (see (8.36)). Thus, the problem of minimizing the least stable
eigenvalue of the PDE (8.26) is converted to the following optimization problem:

\[ \min_{\{k_s(x) = 0, \ \int_0^{2\pi} |k_m(x)|^2 dx = 1\}} \lambda_1. \tag{8.41} \]
Even with $k_m(x)$ alone, the optimization of a non-symmetric eigenvalue problem
(as in our case) is challenging with limited theory for guidance; see section 16
of the review paper [172] on eigenvalue optimization. As a first step, we relax the
optimization problem by replacing the operator $L(\epsilon)$ by its symmetric component:

\[ L^s(\epsilon) \eta = \left( \frac{L + L^*}{2} \right) \eta = a_0^2 \frac{d^2 \eta}{dx^2} - \frac{\epsilon}{2\rho_0} k_m'(x) \eta, \tag{8.42} \]

where $L^*$ is the adjoint of $L$ and $k_m'(x) := \frac{dk_m}{dx}(x)$. Let $\lambda_1^s$ denote the principal
eigenvalue of $L^s$. The following lemma gives the relationship between $\lambda_1$ and $\lambda_1^s$.

Lemma 8.6.1. Let $\lambda_1$ denote the principal eigenvalue of the operator $L$ in (8.27)
and $\lambda_1^s$ denote the principal eigenvalue of the symmetric operator $L^s$ in (8.42).
Then

\[ \lambda_1 \le \lambda_1^s. \]

□


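Proof. Let $\lambda_1$ be the principal eigenvalue and $\phi(x)$ be the corresponding positive
eigenfunction of the non-symmetric problem:

\[ a_0^2 \frac{d^2 \phi}{dx^2} + \epsilon \frac{k_m(x)}{\rho_0} \frac{d\phi}{dx} = \lambda_1 \phi. \]

Multiplying by $\phi$ and integrating by parts, we obtain

\[ -a_0^2 \int_0^{2\pi} \left( \frac{d\phi}{dx} \right)^2 dx - \frac{\epsilon}{2\rho_0} \int_0^{2\pi} k_m'(x) \phi^2 dx = \lambda_1 \int_0^{2\pi} \phi^2 dx. \]

We have

\[ \lambda_1 \le \max_{\phi \ne 0} \frac{-a_0^2 \int_0^{2\pi} \left( \frac{d\phi}{dx} \right)^2 dx - \frac{\epsilon}{2\rho_0} \int_0^{2\pi} k_m'(x) \phi^2 dx}{\int_0^{2\pi} \phi^2 dx} = \lambda_1^s, \]

where the last equality follows from the variational characterization of the principal
eigenvalue for a symmetric elliptic problem. ■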
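The inequality of Lemma 8.6.1 has an exact matrix analogue that is easy to test: for a real matrix with a real principal eigenvalue, that eigenvalue cannot exceed the largest eigenvalue of the symmetric part $(M + M^T)/2$. The Python sketch below is an illustration only, not taken from the source. It discretizes $a_0^2\, d^2/dx^2 + \epsilon (k_m/\rho_0)\, d/dx$ by central differences with Dirichlet boundary conditions, assuming $a_0^2 = k_0/\rho_0^2$ and $\rho_0 = N/(2\pi)$; the grid size, the profile, and the function names are hypothetical choices.

```python
import math

def principal_eig(lower, diag, upper, iters=8000):
    """Principal eigenvalue of a tridiagonal matrix with positive off-diagonals,
    by power iteration on the entrywise nonnegative shifted matrix M + c*I."""
    n = len(diag)
    c = 2.0 * max(abs(d) for d in diag)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [(diag[i] + c) * v[i]
             + (lower[i] * v[i - 1] if i > 0 else 0.0)
             + (upper[i] * v[i + 1] if i < n - 1 else 0.0)
             for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - c

def build_operator(km, eps, n_vehicles=50, n_grid=40, k0=1.0):
    """Tridiagonal (lower, diag, upper) for a0^2 d2/dx2 + eps*(km/rho0) d/dx."""
    rho0 = n_vehicles / (2.0 * math.pi)   # assumed mean density
    a0sq = k0 / rho0 ** 2                 # assumed a0^2 = k0 / rho0^2
    h = 2.0 * math.pi / (n_grid + 1)
    adv = [eps * km((i + 1) * h) / (2.0 * h * rho0) for i in range(n_grid)]
    lower = [a0sq / h ** 2 - a for a in adv]
    upper = [a0sq / h ** 2 + a for a in adv]
    diag = [-2.0 * a0sq / h ** 2] * n_grid
    return lower, diag, upper

def symmetric_part(lower, diag, upper):
    """(M + M^T)/2 for a tridiagonal M; entry (i, i+1) of M is upper[i] and
    entry (i+1, i) is lower[i+1]."""
    n = len(diag)
    off = [0.5 * (upper[i] + lower[i + 1]) for i in range(n - 1)]
    return [0.0] + off, diag, off + [0.0]

km = lambda x: -math.sin(x) / math.sqrt(math.pi)   # unit-norm sinusoidal profile
eps = 0.2 * math.sqrt(math.pi)
L = build_operator(km, eps)
lam_1 = principal_eig(*L)                    # non-symmetric operator
lam_1s = principal_eig(*symmetric_part(*L))  # its symmetric part
```

With these values `lam_1` should come out no larger than `lam_1s`, mirroring $\lambda_1 \le \lambda_1^s$.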


Instead of the original eigenvalue optimization problem (8.41), we pose and
solve the following simpler optimization problem:

\[ \min_{\{\int_0^{2\pi} |k_m|^2 dx = 1\}} \lambda^s, \tag{8.43} \]
where $\lambda^s$ is the principal (largest) eigenvalue of $L^s$:

\[ a_0^2 \frac{d^2 \phi}{dx^2} - \frac{\epsilon}{2\rho_0} k_m'(x) \phi = \lambda^s \phi, \tag{8.44} \]

with Dirichlet boundary conditions (8.16). Because of Lemma 8.6.1, the solution
to this problem provides an upper bound on the solution to (8.41). Among the two
roots of the characteristic equation $s^2 + b_0 s - \lambda^s = 0$, the one closer to 0 - which
we denote by $(s_1^+)^s$ - is the least stable eigenvalue of the mistuned symmetric
PDE

\[ \frac{\partial^2 v}{\partial t^2} + b_0 \frac{\partial v}{\partial t} = L^s(\epsilon) v. \tag{8.45} \]

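In the symmetric and non-symmetric cases respectively,

\[ (s_1^+)^s \to \frac{\lambda^s}{b_0} \ \text{ as } N \to \infty, \quad \text{and} \quad s_1^+ \to \frac{\lambda_1}{b_0} \ \text{ as } N \to \infty. \]

From Lemma 8.6.1, we get $s_1^+ \le (s_1^+)^s$ in the limit of large $N$.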
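The link between an operator eigenvalue $\lambda$ and the corresponding least stable root of $s^2 + b_0 s - \lambda = 0$ can be illustrated with a few lines of Python. This is a minimal sketch with a hypothetical function name and illustrative numbers; it only evaluates the quadratic formula and compares the result with the limiting value $\lambda / b_0$ for small $|\lambda|$.

```python
import math

def least_stable_root(lam, b0):
    """Root of s^2 + b0*s - lam = 0 with the larger real part (real part only)."""
    disc = b0 * b0 + 4.0 * lam
    if disc >= 0.0:
        return (-b0 + math.sqrt(disc)) / 2.0
    return -b0 / 2.0   # complex-conjugate pair: both roots share this real part

b0 = 0.5
lam_small = -1e-4                       # illustrative eigenvalue close to zero
root = least_stable_root(lam_small, b0)
approx = lam_small / b0                 # limiting value lambda / b0

# The root is monotone in lam, so an ordering of eigenvalues carries over.
root_a = least_stable_root(-0.02, b0)
root_b = least_stable_root(-0.01, b0)
```

Because the root increases with $\lambda$ on the real branch, an upper bound on $\lambda_1$ translates directly into an upper bound on $s_1^+$.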


The calculations leading to formulas for the principal eigenvalue, eigenfunction,
and the optimal mistuning gain profile for the symmetric PDE are presented in
Section 8.9. Figure 8.8 presents the optimum mistuning profiles for the symmetric
PDE for three different values of $\epsilon$. Before presenting numerical validation of
eigenvalues, we summarize the main conclusions of the optimization calculations
in Section 8.9:


1.  The optimization calculations provide a rigorous $O(1/N)$ bound on the least
stable eigenvalue of the symmetric problem for non-vanishing values of mistuning
amplitude $\epsilon$. Using Lemma 8.6.1, this leads to an $O(1/N)$ bound for
the least stable eigenvalue of the non-symmetric PDE and hence the discrete
platoon.

2.  The results of the optimization calculation are also shown to be consistent
with the results of perturbation analysis. In particular, the optimal mistuning
profile tends to the beneficial sinusoidal profile (see Corollary 8.6.1) in
the limit $\epsilon \to 0$.

3.  Using a symmetry technique, the optimal mistuning profile for scenario II
is shown to be a stretched (by a factor of 2) version of the optimal mistuning
profile for scenario I. A formula linking the two is given, which helps
generalize the results of perturbation analysis to non-vanishing values of $\epsilon$.

Figure 8.9 shows the trend of the least stable eigenvalue of the symmetric
PDE with the optimum mistuning profile $k_m^*(x)$, and compares it with the least
stable eigenvalues of the non-symmetric PDE (8.26) under both sinusoidal and
$k_m^*(x)$ mistuning. It is seen from the figure that with the mistuning profile $k_m^*(x)$,
the least stable eigenvalue for the non-symmetric PDE is smaller (more to the
left) than its symmetric counterpart. This is consistent with the conclusion of
Lemma 8.6.1. However, numerically the optimal mistuning profile obtained for
the symmetric PDE was found to be sub-optimal for the non-symmetric PDE
corresponding to the discrete platoon. In particular, for the values of $\epsilon$ tested and shown
in the figures, the sinusoidal mistuning profile was seen to provide greater damping
for the discrete platoon. The numerically computed least stable eigenvalue of
the symmetric PDE with the optimal mistuning profile matches the formula for
the same derived in Section 8.9, and it approaches 0 as $O(1/N)$; see Figure 8.9.
In all cases, eigenvalues of the PDE (8.26) closely matched the discrete platoon
eigenvalues.
Figure 8.8. The optimal mistuning pattern $k_m^*(x)$ for the symmetric PDE computed
according to the procedure laid out in section 8.6.2 for three different values
of $\epsilon$. The parameters are $N = 50$ and $k_0 = 1$.


Figure 8.9. The least stable eigenvalue when the optimum mistuning profile computed
from the eigenvalue optimization is implemented with $\epsilon = 0.2\sqrt{\pi}$. For
comparison, several other eigenvalue calculations are shown. PDE (8.45) is the
symmetric PDE.
8.7  Simulations


We now present results of a few simulations that show the time-domain improvements
- manifested in faster decay of initial errors - with the mistuning-based
design of control gains. Simulations were carried out for a platoon of $N = 20$ vehicles
with scenario I, i.e., with fictitious lead and follow vehicles. The desired gap
was $\Delta = 1$ and the desired velocity was $V_d = 5$. The initial velocity of every vehicle
was chosen as the desired velocity and the initial position of the $i$th vehicle was
chosen as $Z_i(0) = i\Delta - 0.5$ for $i \in \{1, \ldots, N\}$.

Figure 8.10 depicts the time-histories of the absolute and relative position
errors of the individual vehicles with a symmetric bidirectional control, where the
control gains were chosen as $k_i^{(f)} = k_i^{(b)} = 1$ and $b_i = 0.5$ for $i \in \{1, \ldots, N\}$. The
position errors shown are un-normalized, i.e., the absolute position error of the $i$th
vehicle is $Z_i - Z_i^d$ and the relative position error is $Z_{i-1} - Z_i - \Delta$.

Figure 8.11 depicts the time-histories of the absolute and relative position
errors for the platoon with mistuned controller gains. The mistuning profile was
chosen according to Corollary 8.6.1 so that the maximum and minimum gains over all
vehicles are within ±10% of the nominal value. On comparing Figures 8.10 and 8.11,
we see that the errors in the initial conditions are reduced faster in the mistuned
case compared to the nominal case. These observations are consistent with the
improvement in the closed-loop stability margin with the mistuned design.
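The stability-margin comparison behind these simulations can be reproduced in outline with a short eigenvalue computation. The Python sketch below is an illustration under stated assumptions, not the code used for the figures. It assumes error dynamics of the form $\ddot{e}_i = -k_i^{(f)}(e_i - e_{i-1}) - k_i^{(b)}(e_i - e_{i+1}) - b\,\dot{e}_i$ with $e_0 = e_{N+1} = 0$ (fictitious lead and follow vehicles), a common damping gain $b$, and normalized desired positions $y_i^d = 2\pi(1 - i/(N+1))$; the actual definition in (8.2) may differ. With a common $b$, each eigenvalue $\lambda$ of the coupling matrix yields closed-loop eigenvalues through $s^2 + b s - \lambda = 0$.

```python
import math

def principal_eig(lower, diag, upper, iters=8000):
    """Principal eigenvalue of a tridiagonal matrix with positive off-diagonals,
    by power iteration on the entrywise nonnegative shifted matrix M + c*I."""
    n = len(diag)
    c = 2.0 * max(abs(d) for d in diag)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [(diag[i] + c) * v[i]
             + (lower[i] * v[i - 1] if i > 0 else 0.0)
             + (upper[i] * v[i + 1] if i < n - 1 else 0.0)
             for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - c

def least_stable_eigenvalue(kf, kb, b):
    """Largest real part among closed-loop eigenvalues of the assumed platoon model."""
    lower = list(kf)                          # coupling to the vehicle in front
    upper = list(kb)                          # coupling to the vehicle behind
    diag = [-(f + g) for f, g in zip(kf, kb)]
    lam = principal_eig(lower, diag, upper)   # principal eigenvalue of the coupling
    disc = b * b + 4.0 * lam
    return (-b + math.sqrt(disc)) / 2.0 if disc >= 0.0 else -b / 2.0

N, b = 20, 0.5
# Assumed normalized desired positions, decreasing from the lead vehicle.
y = [2.0 * math.pi * (1.0 - i / (N + 1)) for i in range(1, N + 1)]

nominal = least_stable_eigenvalue([1.0] * N, [1.0] * N, b)
mistuned = least_stable_eigenvalue([1.0 - 0.1 * math.sin(t) for t in y],
                                   [1.0 + 0.1 * math.sin(t) for t in y], b)
```

Under these assumptions `mistuned` lies further to the left than `nominal`, in line with the faster error decay reported for the mistuned gains.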


(a)  Absolute  position  errors 


(b)  Relative  position  errors 

Figure 8.10. Simulations with symmetric bidirectional control. Time histories of
the absolute and relative position errors of the vehicles in a platoon with symmetric
bidirectional control (scenario I) are shown.
(a)  Absolute position errors

(b)  Relative position errors

Figure 8.11. Simulations with mistuned bidirectional control. Time histories of
the absolute and relative position errors of the vehicles in a platoon (scenario I)
with mistuned bidirectional control are shown. Controller gains were chosen as
$k_i^{(f)} = 1 - 0.1 \sin(y_i^d)$, $k_i^{(b)} = 1 + 0.1 \sin(y_i^d)$, where $y_i^d$ is defined in (8.2).




8.8  Comments  and  open  problems 


Although the model is derived under the assumption of a large number of vehicles
$N$, in practice it provides quantitatively correct predictions for the discrete
platoon dynamics even for relatively small values of $N$. The advantage of the
PDE formulation is reflected in the ease with which the spectrum can be obtained
even with non-symmetric boundary conditions. Finally, certain important aspects,
such as the beneficial nature of forward-backward asymmetry in control gains
revealed by the PDE, are difficult to see with the discrete platoon model.

A promising research direction is exploring the use of PDE-based models for
design and analysis of decentralized controllers for a fleet of vehicles in 2 or 3 spatial
dimensions. Another promising direction is the study of formations with time-varying
topology through a continuum model. Multi-agent coordination problems
in which the interconnections between agents may change with time are difficult to
analyze due to their time-varying nature. Perhaps an “aggregate” view afforded
by a continuum approximation - in the form of an appropriate PDE - can be
useful in analysis. However, how to derive the governing PDE for such a situation
and how to validate it is not clear.

We did not investigate in this chapter whether the mistuning-based design reduces
the amplification of disturbances that is typically seen in automated platoons. In
light of the results in Chapter 7, this is an important problem and needs to be
studied in the future.
8.9  Proofs


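In this section, we present the solution of the eigenvalue optimization problem
(8.43). The optimization is based upon a method originally due to Keller [173].
Assume $k_m^*(x)$ minimizes the largest eigenvalue and is thus the solution. After
Keller, we introduce a family of functions $k_m(x, \delta)$ with $k_m^*(x) = k_m(x, 0)$ to construct
a differential characterization of this optimum. For each $\delta$, the principal
eigenvalue and the eigenfunction are given by $\lambda^s(\delta)$ and $\phi(x, \delta)$ respectively.
Differentiating (8.44) with respect to $\delta$ and evaluating at $\delta = 0$, where
$\partial \lambda^s / \partial \delta = 0$ by optimality, gives

\[ a_0^2 \frac{d^2 \phi_\delta}{dx^2} - \frac{\epsilon}{2\rho_0} \left[ (k_m')_\delta(x) \phi + k_m'(x) \phi_\delta \right] = \lambda^s \phi_\delta, \]

where $(k_m')_\delta(x) = \frac{\partial k_m'(x, \delta)}{\partial \delta} \big|_{\delta=0}$ and $\phi_\delta = \frac{\partial \phi}{\partial \delta} \big|_{\delta=0}$. Multiplying by $\phi$, integrating, and
using (8.44) gives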


\[ \int_0^{2\pi} (k_m')_\delta(x) \phi^2 dx = 0. \tag{8.46} \]

On differentiating the constraint (8.40), we obtain

\[ \int_0^{2\pi} k_m(x) (k_m)_\delta(x) dx = 0. \tag{8.47} \]


Since $(k_m)_\delta(x)$ represents an arbitrary perturbation about the optimum, the two
equations (8.46)-(8.47) imply that the optimal mistuning pattern is given by

\[ k_m^*(x) = -C \phi \frac{d\phi}{dx}, \tag{8.48} \]


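where $C$ is some constant. It follows that

\[ k_m'(x) = -\frac{C}{2} \frac{d^2 (\phi^2)}{dx^2}, \]

and substituting this in (8.44), one obtains a nonlinear BVP

\[ a_0^2 \phi'' + \frac{\epsilon C}{2\rho_0} \left[ \phi \phi'' + (\phi')^2 \right] \phi = \lambda^s \phi, \tag{8.49} \]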
with Dirichlet boundary conditions (8.16). The principal eigenfunction of this
problem then defines the optimal mistuning profile $k_m^*(x)$ by (8.48).

Before presenting the details of the calculations, we make two observations.
One, the nonlinear BVP admits a symmetry whereby if $\phi(x)$ is an eigenfunction
then so is $\phi(2\pi - x)$. Now, the principal eigenfunction of any elliptic eigenvalue
problem is known to be unique (see Ch. 15 of [174]). This implies that $\phi(x) =
\phi(2\pi - x)$, and at $x = \pi$

\[ \frac{d\phi}{dx}(\pi) = 0. \tag{8.50} \]

Thus, the solution $\phi(x)$ of (8.49) also provides the principal eigenfunction with
the Neumann-Dirichlet boundary conditions (8.16). It is given by $\phi(\frac{x}{2} + \pi)$ and
the optimal profile is obtained as before by using (8.48).

The second observation pertains to a comparison with the results obtained
using perturbation methods. To do so, we consider the $\epsilon \to 0$ limit first. In this
limit, the principal eigenfunction (of (8.49)) is given by $\phi = \sin(\frac{x}{2})$. Using (8.48),
one obtains the optimal mistuning pattern for the limiting case

\[ k_m^*(x) = -C' \sin(x), \]

where $C' = 1/\sqrt{\pi}$ satisfies the norm constraint. This is consistent with the optimal
mistuning profile obtained using perturbation methods. For small $\epsilon$, this also
provides an estimate of the eigenvalue


A  -_T 


16vrp0 


+  0(e2), 


which using (8.39) yields the result of Corollary 8.6.1. Finally, using the symmetry
arguments, the principal eigenfunction for the Neumann-Dirichlet case is
given by $\phi(\frac{x}{2} + \pi) = \cos(\frac{x}{4})$ and using (8.48), the optimal mistuning
profile is $k_m^*(x) = C \sin(\frac{x}{2})$. This too is consistent with the results obtained using
perturbation methods (see Corollary 8.6.2).


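In order to compute the optimal mistuning profile, the ODE (8.49) is first
simplified to

\[ \frac{d\left[ \lambda^s - \frac{\epsilon C}{2\rho_0} (\phi')^2 \right]}{\lambda^s - \frac{\epsilon C}{2\rho_0} (\phi')^2} = -\frac{\frac{\epsilon C}{\rho_0} \phi}{a_0^2 + \frac{\epsilon C}{2\rho_0} \phi^2} \, d\phi, \]

that on integration gives

\[ \lambda^s - \frac{\epsilon C}{2\rho_0} \left( \frac{d\phi}{dx} \right)^2 = \frac{D}{1 + \frac{\epsilon C}{2\rho_0 a_0^2} \phi^2}, \tag{8.51} \]

where $D$ is a constant of integration. Using (8.50), we get $D = \lambda^s \left( 1 + \frac{\epsilon C}{2\rho_0 a_0^2} y_0^2 \right)$,
where $y_0 := \phi(\pi)$. As a result, (8.51) becomes

\[ \left[ 1 - \frac{\epsilon C}{2\rho_0 \lambda^s} \left( \frac{d\phi}{dx} \right)^2 \right] \left[ 1 + \frac{\epsilon C}{2\rho_0 a_0^2} \phi^2(x) \right] = 1 + \frac{\epsilon C}{2\rho_0 a_0^2} y_0^2. \]

After some manipulation, this equation yields the integral

\[ \frac{a_0}{\sqrt{-\lambda^s}} \int_0^{\phi(x)} \left[ \frac{1 + \frac{\epsilon C}{2\rho_0 a_0^2} y^2}{y_0^2 - y^2} \right]^{1/2} dy = x. \tag{8.52} \]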


The solution to this integral requires elliptic integrals of the second kind. In
particular, we propose a co-ordinate change

\[ \phi = y_0 \sin\left( \frac{\theta}{2} \right), \quad \theta \in [0, 2\pi], \tag{8.53} \]

and define $\beta := C y_0^2$. Using (8.52), $\theta$ is a solution to an implicit elliptic integral
equation

\[ \int_0^{\theta} \left[ 1 + \frac{\epsilon \beta}{2\rho_0 a_0^2} \sin^2\left( \frac{\vartheta}{2} \right) \right]^{1/2} d\vartheta = \frac{2\sqrt{-\lambda^s}}{a_0} x, \tag{8.54} \]

where substituting $\theta = \pi$ and using (8.53), one obtains an implicit relationship
between $\beta$ and $\lambda^s$:

\[ \int_0^{\pi} \left[ 1 + \frac{\epsilon \beta}{2\rho_0 a_0^2} \sin^2\left( \frac{\vartheta}{2} \right) \right]^{1/2} d\vartheta = \frac{2\pi\sqrt{-\lambda^s}}{a_0}. \tag{8.55} \]


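In the $\theta$ co-ordinate,

\[ \frac{d\phi}{dx} = \frac{y_0}{2} \cos\left( \frac{\theta}{2} \right) \frac{d\theta}{dx} = \frac{y_0 \sqrt{-\lambda^s}}{a_0} \cos\left( \frac{\theta}{2} \right) \left[ 1 + \frac{\epsilon \beta}{2\rho_0 a_0^2} \sin^2 \frac{\theta}{2} \right]^{-1/2}, \tag{8.56} \]

where $d\theta/dx$ is obtained from (8.54). Using (8.48), (8.53), (8.55) and (8.56), we
get the optimal pattern in the $\theta$ co-ordinate as

\[ k_m^*(\theta) = -A(\beta) \sin(\theta) \left[ 1 + \frac{\epsilon \beta}{2\rho_0 a_0^2} \sin^2 \frac{\theta}{2} \right]^{-1/2}, \tag{8.57} \]

where

\[ A(\beta) = \frac{\beta}{8\pi} \int_0^{2\pi} \left[ 1 + \frac{\epsilon \beta}{2\rho_0 a_0^2} \sin^2 \frac{\vartheta}{2} \right]^{1/2} d\vartheta. \]

Now use the constraint $\int_0^{2\pi} k_m^*(x)^2 dx = 1$ (which implies $\int_0^{2\pi} k_m^*(\theta)^2 (dx/d\theta) d\theta = 1$) to
deduce the unknown constant $\beta$.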
The eigenvalue $\lambda^s$ is then obtained from (8.55), $\theta(x)$ is obtained from (8.54),
and the optimal mistuning pattern in the $x$-coordinate, $k_m^*(x)$, is deduced after
substituting $\theta(x)$ in (8.57). Figure 8.8 depicts a few typical optimal mistuning
patterns for different values of $\epsilon$. Consistent with the results obtained using the
perturbation method, the optimal mistuning pattern is close to the sinusoidal
pattern for small values of $\epsilon$. Finally, using the eigenvalue formula (8.55) and
doing a little reduction, one also sees that the least stable eigenvalue approaches
0 as $O(1/N)$ for $\epsilon > 0$. Numerical evaluations of this exact formula appear in
Figure 8.9.
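The procedure of this section can be sketched numerically. The Python code below is a hedged illustration, not the author's implementation, and it rests on relations derived here from (8.48) and (8.49) rather than read off the source. Write $m = \epsilon \beta / (2 \rho_0 a_0^2)$, $I(\beta) = \int_0^{\pi} \sqrt{1 + m \sin^2(\theta/2)}\, d\theta$ and $J(\beta) = \int_0^{2\pi} \sin^2\theta / \sqrt{1 + m \sin^2(\theta/2)}\, d\theta$. The assumed relations are $\lambda^s = -a_0^2 (I / 2\pi)^2$, $k_m^*(\theta) = -(\beta I / 4\pi) \sin\theta / \sqrt{1 + m \sin^2(\theta/2)}$, $x(\theta) = (\pi / I) \int_0^{\theta} \sqrt{1 + m \sin^2(\vartheta/2)}\, d\vartheta$, and the norm constraint in the form $\beta^2 I J / (16\pi) = 1$. The values $a_0^2 = k_0/\rho_0^2$ and $\rho_0 = N/(2\pi)$, the bisection bracket, and all function names are likewise assumptions.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def optimal_profile(eps, n_vehicles, k0=1.0, samples=200):
    """Return (beta, lam_s, profile); profile is a list of (x, k_m*(x)) pairs."""
    rho0 = n_vehicles / (2.0 * math.pi)   # assumed mean density
    a0sq = k0 / rho0 ** 2                 # assumed a0^2 = k0 / rho0^2

    def parts(beta):
        m = eps * beta / (2.0 * rho0 * a0sq)
        root = lambda t: math.sqrt(1.0 + m * math.sin(0.5 * t) ** 2)
        big_i = simpson(root, 0.0, math.pi)
        big_j = simpson(lambda t: math.sin(t) ** 2 / root(t), 0.0, 2.0 * math.pi)
        return root, big_i, big_j

    # Solve the assumed norm constraint beta^2 * I * J / (16*pi) = 1 by bisection.
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        _, big_i, big_j = parts(mid)
        if mid * mid * big_i * big_j / (16.0 * math.pi) < 1.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)

    root, big_i, _ = parts(beta)
    lam_s = -a0sq * (big_i / (2.0 * math.pi)) ** 2   # eigenvalue relation
    amp = beta * big_i / (4.0 * math.pi)             # amplitude of the pattern
    profile = []
    for j in range(samples + 1):
        theta = 2.0 * math.pi * j / samples
        x = (math.pi / big_i) * simpson(root, 0.0, theta, 100) if j else 0.0
        profile.append((x, -amp * math.sin(theta) / root(theta)))
    return beta, lam_s, profile

beta_small, lam_small, _ = optimal_profile(1e-6, 50)
beta, lam_s, prof = optimal_profile(0.2 * math.sqrt(math.pi), 50)
```

For very small `eps` the computed `beta` approaches $4/\sqrt{\pi}$, which corresponds to the sinusoidal profile $-\sin(x)/\sqrt{\pi}$. For larger `eps` the returned samples trace a distorted sinusoid of unit $L^2$ norm.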
Chapter 9


Summary 


The estimation and control problems examined in this dissertation share the
common attribute of being defined over graphs. The other common feature is
that only relative measurements are available. The underlying theme of our
investigations is the effect of interconnect structure in large-scale systems. An
important lesson learned from the results described here is that interconnection
topology dictates, to a large extent, the achievable performance. Moreover, the
matrix-valued effective resistance - introduced in this dissertation - has proven to
be useful in analyzing the scalability of both estimation and control algorithms.